I. INTRODUCTION

The purpose of silicon compilation is to allow faster
design of integrated circuits. Silicon compilation frees the
designer from the basic layout, routing, and circuitry
concerns inherent to integrated circuit design. The MacPitts
silicon compiler does this by designing an integrated
circuit chip from a behavioral specification input.

Previous work at the Naval Postgraduate School
investigated applications of the MacPitts silicon compiler
to the design of pipelined digital adders [Ref. 1] and
multipliers [Ref. 2]. Work by Froede [Ref. 3] showed the
limitations of MacPitts in its inability to produce fast
VLSI chips. This deficiency is due primarily to the layout
scheme (circuit structure) which MacPitts uses.

This thesis investigates the interrelationship between
MacPitts algorithmic syntax and resulting circuit structure.
MacPitts partitions the chip functionally as shown in Figure
1.1. The data path is at the top, and performs numerical
operations and combinational logic tests. The control path
is at the bottom, and performs decisions which direct data
path operations.

Chapter II considers combinational logic in both the
data path and control path. The effects of syntax on
combinational logic structures are investigated
qualitatively, and inefficiencies and limitations of
implementation are noted. The basic data path organelles
(fundamental combinational logic structures) are also
investigated.

Figure 1.1 MacPitts Chip Functional Block Diagram

Chapter III is a quantitative treatment of functionally
equivalent circuits in the data path and control path. A
five-input AND gate is created in both the data path and
the control path, and a comparative analysis is performed.
The results are extended to similar data path combinational
logic structures.

Chapter IV investigates MacPitts sequential logic. A
Gray code-to-binary serial decoder is designed, and a
functional analysis is performed. The relationship between
syntax and circuit structure is emphasized, with an
alternate solution considered. A blackjack game chip is
presented as a more elaborate MacPitts finite state machine
(FSM), and its structure is contrasted to that of the Gray
code decoder. The Mead-Conway highway-farmroad traffic light
controller problem [Ref. 4:p. 81] is solved with a
MacPitts design, and an alternate solution is offered.

Chapter V is a quantitative comparison of a MacPitts
design with a handcrafted equivalent. The Mead-Conway
traffic light controller design from Chapter IV is compared
to a computer-aided engineering (CAE) designed variant,
which has a programmed logic array (PLA) FSM. The designs
are compared for speed, size, and power consumption.



Chapter VI is a design example. A design cycle for
MacPitts is developed, and illustrated with the Hamming 15/4
error detector/corrector [Ref. 5]. The prototype (first
model) and archetype (chief model) algorithms and chip
layouts are provided. An analysis of the alternate designs
is given, and a basis for choosing the archetype is
proposed. The Hamming 15/4 error detector/corrector is then
designed based on the archetype, and analyzed with available
CAD tools.

Chapter VII is a summary of errors detected in the
MacPitts silicon compiler and suggestions for enhancement.
The errors and suggestions are cross-referenced to MacPitts
source code where possible.



II. COMBINATIONAL LOGIC STRUCTURES IN THE MACPITTS
SILICON COMPILER

Inasmuch as the MacPitts algorithm creates combinational
logic functions, it would be helpful to know how it does
this. Does there exist an explicit directive to the LISP
object file which calls and implements the logical functions
requested, or are they implicitly specified? If the latter
is true, it would suggest simpler source algorithms could be
written to specify the circuit function. If the former case
is true, then more lengthy algorithms are required, but the
circuit designer has more latitude for direct control and
optimization of layout.

A. COMBINATIONAL LOGIC CIRCUITS IN THE DATA PATH 

Combinational logic structure instantiation in the data
path of a MacPitts generated chip is directed by the
data-path.lisp file in the MacPitts source code. The
data-path.lisp file calls specific functional units called
organelles from the organelles.lisp file to implement the
desired logic. These LISP files are compiled under the Liszt
compiler and linked to the rest of the compiled MacPitts
files by the available Makefile routine. The resulting 1.6
Megabyte binary image constitutes the integrated MacPitts
silicon compiler.



1. The Basic Chip Frame

The initial investigation consisted of the
MacPitts-generated design frame called wire.mac. The
algorithm to create this structure is shown in Figure 2.1.

;WIRE.MAC
;SOURCE CODE FOR ALGORITHMIC CREATION OF NO
;FUNCTION BY MACPITTS SILICON COMPILER.
(program wire 1
  (def 1 ground)
  (def ain port input (2))
  (def res port output (3))
  (def 4 phia)
  (def 5 phib)
  (def 6 phic)
  (def 7 power)
  (always
    (setq res ain)))

Figure 2.1 Wire.mac

The extension .mac refers to a MacPitts algorithm. MacPitts
is taken to refer to the silicon compiler, the pseudo-LISP
language which it uses, and the LISP source routines which
constitute the silicon compiler. To avoid confusion, the
MacPitts driver routines written by the chip designer will
be referred to as algorithms. Other meanings of the term
MacPitts will be clarified by context.

MacPitts produces a seven pad chip, routing the
input directly to the output without clocking. The three
phase clocking is not required for this circuit, so the
clock runs all terminate within the chip frame without
connections, as shown in Figure 2.2. The three phase clock
must be specified in the algorithm, however, and the clock
traces are produced whether they are used or not. Note that
the pads are placed around only three sides of the chip,
and the clock pads are also placed in the order specified in
the driver algorithm (Figure 2.1). Furthermore, neither the
clock traces nor the signal lines take a direct route to
their destinations. Even though these lines are all metal,
the excess lengths reduce the maximum chip speed because of
the added capacitance. This topic will be treated in a later
chapter. The data path Vdd-ground comb does not connect with
the Vdd rail at bottom left on the stipple plot. This is
common with very small data path chips, and the error can be
corrected in Caesar or a similar VLSI graphics editor.

2. A Data Path Inverter 

The next program, macnot.mac, shown in Figure 2.3,
specified a logical NOT function. As expected, MacPitts used
a single inverter of 4:1 ratio in the data path. The input,
which is on the top left diffusion line in Figure 2.4, runs
to the gate of the NMOS inverter via a metal and diffusion
routing, and the inverted output comes out on a polysilicon
line from the far right of the circuit. It was also noted
that the logical integer specification is required for NOT,
i.e., one must use (word-not) rather than (not). The reason
for this is given in Southard [Ref. 6:pp. 47-48], which
indicates that integer logical operators must be used on
word elements (ports and registers), and Boolean logical
operators on control elements (flags and signals). The
logical Boolean specification (not) is used on flags, input
signals, and internal signals, but it is not used for input
ports or register contents. In either Boolean or integer
data types, the NOT function takes a single value, as would
be expected.

The syntax of the driver algorithm (the .mac file)
is data-type sensitive, in much the same way that Fortran is
sensitive to the integer and floating point data types. The
two data types (from the programming perspective) are
Boolean and integer. Each data type is treated differently
by the MacPitts compiler, and each requires a different
syntax for the equivalent function. An example will clarify
this distinction:



FUNCTION    DATA TYPE    ALGORITHMIC STATEMENT
NOT         Boolean      (not a)
NOT         integer      (word-not a)
AND         Boolean      (and a b)
AND         integer      (word-and a b)



The fundamental difference in data types is
argument length. Boolean data are of single bit length,
whereas integer data are of word length (one bit or
greater). Integer type data operations all occur in the
data path of a MacPitts design, and Boolean operations all
occur in the control path.

In Figure 2.3, the data type is declared in the
DEF statement, the form of which is

     (def <name> <function> <input, output, or internal>
          <pin number(s)>)


;MACNOT.MAC
;SOURCE CODE FOR ALGORITHMIC CREATION OF LOGICAL
;<not> FUNCTION BY MACPITTS SILICON COMPILER
(program macnot 1
  (def 1 ground)
  (def a port input (2))
;a=input//b=output
  (def b port output (3))
  (def 4 phia)
  (def 5 phib)
  (def 6 phic)
;must show 3-phs clk, even if not used
  (def 7 power)
  (always
    (setq b (word-not a))))

Figure 2.3 Macnot.mac
Figure 2.4 Data Path Inverter

where the name is any ASCII character string, and the function
can be either port, signal, register, or flag. The next
field determines where the data is applied, and for most
circuits is either input or output. The pin number is
required for all input and output data. The data type is
determined by the function field. Signals and flags are
Boolean data; ports and registers are integer (word length)
data. The subsequent MacPitts forms in the driver algorithm
must agree in type with the DEF declarations.

If an incorrect data type specification is used,
MacPitts generates an appropriate error diagnostic at
compilation time. For instance, if one were to define the
inputs hot and cold as Boolean type and attempt integer
operations on them as follows

     (def hot signal input 5)
     (def cold signal input 6)
     (setq warm (word-nor hot cold))

the following diagnostic would result at compilation time:

     Error: logical coercion to integer not implemented yet

Similarly, if Boolean operations are attempted on integer
data, the following diagnostic results at compilation time:

     Error: Boolean conversion not implemented yet
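
The type agreement rule described above can be illustrated with a small model. The following Python sketch is not taken from the MacPitts source code; the names check_form, BOOLEAN_KINDS, and INTEGER_KINDS are hypothetical, and the operator lists contain only operators named in this chapter. It shows, under those assumptions, how a mismatch between the DEF declaration and the operator type leads to the two diagnostics quoted above.

```python
# Hypothetical model of MacPitts operand typing. The names below are
# illustrative only and are not taken from the compiler source.
BOOLEAN_KINDS = {"signal", "flag"}      # single-bit data, control path
INTEGER_KINDS = {"port", "register"}    # word-length data, data path

INTEGER_OPS = {"word-not", "word-and", "word-or", "word-nand", "word-nor"}
BOOLEAN_OPS = {"not", "and", "or", "nand"}


def check_form(op, operands, declarations):
    """Return None if the form is well typed, else a diagnostic string."""
    kinds = [declarations[name] for name in operands]
    if op in INTEGER_OPS and any(k in BOOLEAN_KINDS for k in kinds):
        return "Error: logical coercion to integer not implemented yet"
    if op in BOOLEAN_OPS and any(k in INTEGER_KINDS for k in kinds):
        return "Error: Boolean conversion not implemented yet"
    return None


# Declarations as they would appear in DEF statements: name -> function field.
decls = {"hot": "signal", "cold": "signal", "a": "port", "b": "port"}

print(check_form("word-nor", ["hot", "cold"], decls))  # integer op on signals
print(check_form("and", ["a", "b"], decls))            # Boolean op on ports
print(check_form("word-and", ["a", "b"], decls))       # well typed
```

The sketch only classifies operands by their declared function field; the actual compiler may apply further checks which are not described here.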



MacPitts error diagnostics can be quite confusing
to the inexperienced user. It is suggested that one peruse
the dincoln.lisp, hl.grep, and compmesg.lisp files of the
MacPitts source code to gain insight into the cause of
specific diagnostic messages. This can be easily done on-line
under the BSD Unix operating system. The grep feature
(pattern search and recognition) is used. The general
command format is

     grep <search pattern> <file to search>

For example, if one attempted Boolean operations on a
register (an integer-valued data type) in MacPitts, the
second diagnostic given above would result. To locate the
source of this message, change directory to the residence of
the MacPitts source code and issue the Unix command

     grep boolean *

to locate all occurrences of the word boolean. Caution is
advised in issuing the grep command. If a very common word
is searched for, the search may take quite a long while, and
the results may not be very helpful. The search capability
of the grep command is limited though, as explained in the
BSD Unix manual.



3. A Data Path OR Gate 

Next a MacPitts routine was written to generate a
two input OR gate in the data path. Again, the integer data
specification is required (see Figure 2.5).



;MACOR.MAC
;SOURCE CODE FOR ALGORITHMIC CREATION OF LOGICAL
;<or> FUNCTION BY MACPITTS SILICON COMPILER//2 input gate//
(program macor 1
  (def 1 ground)
  (def a port input (2))
;a,b=inputs//c=output
  (def b port input (3))
  (def c port output (4))
  (def 5 phia)
  (def 6 phib)
  (def 7 phic)
  (def 8 power)
  (always
    (setq c (word-or a b))))

Figure 2.5 Macor.mac



The resulting circuit extracted from the chip is depicted
in Figure 2.6. The OR function is implemented as a NOR gate
followed by an inverter. Figure 2.7 shows the gate
equivalent of a two input data path OR structure. The two
inputs to the NOR gate come in on the left top of the
circuit, the output is then inverted to yield a logical OR
function, and the output of the inverter is routed from the
left back out on the poly line below and parallel to the
input tracks. This routing scheme (river routing) is
determined by the MacPitts source code, and the chip
designer has no control over it. All chip inputs and outputs
are routed inside the main ground bus, with little regard to
minimizing trace length (see Figure 2.2). So an OR gate in
the data path of MacPitts is constructed from a two input
NOR gate with an inverter on the output, and the inputs and
outputs all connect to the data path from the left side.




Figure 2.6 Data Path OR Gate

Figure 2.7 Gate Equivalent of Data Path OR Gate

4. A Data Path NOR Gate



A two input data path NOR function is shown in
Figure 2.8. The resulting circuit in Figure 2.9 shows
instantiation as a two input 8:1 NOR gate, with the inputs
A and B at top left and the result, C, at bottom left. If two
inputs are permissible, are more? Does MacPitts know to
adjust the transistor k values for multiple input gates? A
two input NOR chip was specified in the algorithm, and
MacPitts created a two input NOR gate. So explicit circuit
specification has been realized so far in the MacPitts chip
data path. When the algorithm specifies a NOR function, a
NOR gate is instantiated. As will be discussed later, this
is not the case in the control path.



;MACNOR.MAC
;SOURCE CODE FOR ALGORITHMIC CREATION OF LOGICAL
;<nor> FUNCTION BY MACPITTS SILICON COMPILER//2 input gate//
(program macnor 1
  (def 1 ground)
  (def a port input (2))
;a,b=inputs//c=output
  (def b port input (3))
  (def c port output (4))
  (def 5 phia)
  (def 6 phib)
  (def 7 phic)
;must show 3-phs clk, even if not used
  (def 8 power)
  (always
    (setq c (word-nor a b))))

Figure 2.8 Macnor.mac




Figure 2.9 Data Path NOR Gate 



5. A Four Input NOR Structure In The Data Path

Figure 2.10 shows the MacPitts algorithm to
generate a four input NOR structure (not the functional
equivalent of a four input NOR gate) in the data path. The
MacPitts form used was

     (setq out (word-nor a (word-nor b (word-nor c d))))

where setq is the LISP assignment operator, out is the
output port, a, b, c, and d are the inputs, and all data is of
integer (word) type. The prefix-operator nature of LISP
syntax [Ref. 6:p. 47] indicates the logical operation which
this gate will perform. Figure 2.11 shows the layout of the
circuit MacPitts produces from this algorithm, and Figure
2.12 depicts the gate-level equivalent.

Note the topology: two inputs go to the first NOR
gate, its output and another input go to the next NOR gate,
and the pattern repeats to the third level. The output comes
from the last (rightmost) NOR gate.

This structure will not be the functional
equivalent of a four input NOR gate. As the LISP-like syntax
suggests, the NOR of four inputs is not equivalent to the
cascading of two input NORs.
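
The non-equivalence can be checked by exhaustive enumeration. The Python sketch below is an illustration added for this purpose, not part of the MacPitts tool set; the function names are arbitrary. It compares the nested form (word-nor a (word-nor b (word-nor c d))) with a true four input NOR over all sixteen single-bit input combinations.

```python
from itertools import product


def nor(*bits):
    """NOR of any number of single-bit inputs."""
    return 0 if any(bits) else 1


def cascaded_nor(a, b, c, d):
    # Mirrors (word-nor a (word-nor b (word-nor c d))) for one-bit words.
    return nor(a, nor(b, nor(c, d)))


def four_input_nor(a, b, c, d):
    # A single gate with four inputs, for comparison only.
    return nor(a, b, c, d)


# Collect every input combination on which the two circuits disagree.
mismatches = [bits for bits in product((0, 1), repeat=4)
              if cascaded_nor(*bits) != four_input_nor(*bits)]

print(len(mismatches), "of 16 input combinations differ")
for bits in mismatches:
    print(bits, cascaded_nor(*bits), four_input_nor(*bits))
```

For example, with a=0, b=1, c=0, d=0 the cascade yields 1, whereas a four input NOR gate yields 0.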






;FOURNOR.MAC
;SOURCE CODE FOR ALGORITHMIC CREATION OF LOGICAL
;<nor> STRUCTURE BY MACPITTS SILICON COMPILER//4 inputs//
(program fivnor 1
  (def 1 ground)
  (def a port input (2))
  (def b port input (3))
  (def c port input (4))
  (def d port input (5))
  (def e port input (6))
  (def outr port output (7))
  (def 8 phia)
  (def 9 phib)
  (def 10 phic)
  (def 11 power)
  (always
    (setq outr
      (word-nor a (word-nor b (word-nor c d))))))

Figure 2.10 Fournor.mac




Figure 2.11 Data Path Fournor Circuit 








Figure 2.12 Gate Equivalent of Fournor Circuitry 




6. A Data Path AND Gate

These observations raise the question of how a two
input data path AND gate would be constructed by MacPitts.
The (word-and x y) integer expression is required to
implement this circuit algorithmically, and a reasonably
compact circuit is expected. Figure 2.13 shows the MacPitts
algorithm to create the two input bit AND function in the
data path.



;MACAND.MAC
;SOURCE CODE FOR ALGORITHMIC CREATION OF LOGICAL
;<AND> FUNCTION BY MACPITTS SILICON COMPILER//2 input gate//
(program macand 1
  (def 1 ground)
  (def a port input (2))
  (def b port input (3))
  (def c port output (4))
  (def 5 phia)
  (def 6 phib)
  (def 7 phic)
  (def 8 power)
  (always
    (setq c (word-and a b))))

Figure 2.13 Macand.mac



The AND chip is implemented as a two input 4:1 NAND
gate, the output of which drives a 4:1 inverter. The
stipple plot of this circuit is shown in Figure 2.14, and
its gate level equivalent is shown in Figure 2.15. In
Figure 2.14 note the input similarities to the previous
circuits. The two inputs enter the organelle at top left,
the signal is routed to the gate, and the output exits the
organelle on the bottom polysilicon line at the left. Also
note the difference among layouts of the MacPitts NAND gate
and the MacPitts NOR gate, and the corresponding Mead-Conway
cells [Ref. 4:p. 17].

Figure 2.14 Data Path AND Gate

Figure 2.15 Gate Equivalent of Data Path AND Gate

7. A Three Input AND Structure In The Data Path

The three input AND was expected to produce gates
similar to those of the two input AND: a series of cascaded
NAND gates, each followed by an inverter. Figure 2.16 shows
the algorithm for the three input AND circuit, and Figure
2.17 depicts the resulting layout. The circuit is the
functional equivalent of a three input AND gate, due to the
associativity of AND.

8. Data Path Basic Organelles

When a MacPitts source algorithm is invoked by the
linked binary MacPitts image by issuing the command

     macpitts <filename> <options>

LISP object code is generated (unless the noobj option is
specified, in which case MacPitts searches for a previously
created object file named <filename>.obj). In the
<filename>.obj file it is observed that the data path logical
operations are all derived from NOT, NAND, and NOR LISP
operations. This is because the fundamental hardware building
blocks of MacPitts data path combinational logic are two input
NAND and NOR gates, and NOT gates (inverters). Knowing this,
the reason for the two-input gate implementation depicted in
the previous figures becomes clear.



Any data path logic organelle is composed of these
primitives. The OR organelle is a NOR gate with an inverter
on its output. The AND organelle is a NAND gate with an
inverter on its output. In the data path, these organelles
are assembled into macros in the organelles.lisp file of the
MacPitts source code. The process of silicon compilation is
thereby shortened, since some of the constituent parts are
already put together.

A two input data path NAND gate chip is implemented
exactly as it is specified. A three input NAND structure is
implemented as expected, by cascading two NAND organelles
(the three input NAND structure is not functionally
equivalent to a three input NAND gate). The output, again,
is what the LISP parenthesized notation would lead one to
expect.
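
The composition of organelles from the three primitives can be modeled directly. The Python sketch below is illustrative only; the function names are arbitrary, and the nesting chosen for the three input NAND structure, (word-nand (word-nand a b) c), is an assumption, since the corresponding algorithm is not listed here. It confirms that the cascaded AND structure equals a three input AND, while the cascaded NAND structure does not equal a three input NAND.

```python
from itertools import product


# The three data path primitives, modeled on single bits.
def inv(a):
    return 1 - a


def nand2(a, b):
    return 1 - (a & b)


def nor2(a, b):
    return 1 - (a | b)


# Organelles built from the primitives, as described in the text.
def and_organelle(a, b):
    return inv(nand2(a, b))


def or_organelle(a, b):
    return inv(nor2(a, b))


def and3_structure(a, b, c):
    # (word-and (word-and a b) c)
    return and_organelle(and_organelle(a, b), c)


def nand3_structure(a, b, c):
    # Assumed nesting: (word-nand (word-nand a b) c)
    return nand2(nand2(a, b), c)


inputs = list(product((0, 1), repeat=3))
and_diffs = [x for x in inputs if and3_structure(*x) != (x[0] & x[1] & x[2])]
nand_diffs = [x for x in inputs
              if nand3_structure(*x) != 1 - (x[0] & x[1] & x[2])]

print("AND structure mismatches: ", and_diffs)
print("NAND structure mismatches:", nand_diffs)
```

The AND cascade agrees with a true three input AND because AND is associative; NAND is not associative, which is why the second list is not empty.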

9. Bit Slice Combinational Logic

So far, all examples given have used inputs having
one bit, but the data type specification for data path
combinational logic is integer. Word size data inputs are
treated in the expected way. Figure 2.18 illustrates a
routine which performs the logical AND on two input vectors,
each four bits wide. Notice the similarity of this MacPitts
program to those already given. The only differences between
this routine and the AND of two bits are the word length and
the PORT statements, which make logical and connective
assignments between i/o ports and inter-chip hardware blocks.



;3AND.MAC
;SOURCE CODE FOR 3 INPUT DATA PATH <AND> GATE
(program 3and 1
  (def 1 ground)
  (def a port input (2))
  (def b port input (3))
  (def c port input (4))
  (def d port output (5))
  (def 6 phia)
  (def 7 phib)
  (def 8 phic)
  (def 9 power)
  (always
    (setq d (word-and (word-and a b) c))))

Figure 2.16 3and.mac

Figure 2.17 Circuitry from 3and.cif



Figure 2.19 illustrates the data path circuitry
which implements this logic. It is evident that the logic is
performed by replications of the fundamental MacPitts AND
organelle, a NAND gate with inverted output. In comparing
this circuit to Figure 2.14 the similarity becomes clear.
The word-and integer operation as specified in the source
algorithm translates to a data path AND organelle in the
LISP object file. This organelle is replicated,
instantiated, and connected to inputs and outputs to create
the circuit (cifplot) shown in Figure 2.19. This data path
word operation capability would not usually be applied to
bit-width combinational logic, as the previous discussions
might suggest, but rather to bit-slice operations such as
word masking, parity checks, arithmetic operations, and so
on.
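
The replication of a one-bit organelle across a word can be sketched in a few lines. The following Python fragment is an illustrative model only, with arbitrary names; it treats a word as a list of bits and applies one AND organelle (a NAND followed by an inverter) per bit position, as in a word masking operation.

```python
def nand2(a, b):
    return 1 - (a & b)


def inv(a):
    return 1 - a


def and_organelle(a, b):
    # One bit slice: a NAND gate with an inverter on its output.
    return inv(nand2(a, b))


def word_and(a_bits, b_bits):
    """Bitwise AND of two equal-length words, one organelle per bit."""
    if len(a_bits) != len(b_bits):
        raise ValueError("both words must have the same length")
    return [and_organelle(a, b) for a, b in zip(a_bits, b_bits)]


# A four bit example: mask off the two low-order bits of a word.
data = [1, 0, 1, 0]
mask = [1, 1, 0, 0]
print(word_and(data, mask))
```

Each list element corresponds to one replicated organelle; no bit position depends on any other, which is what makes the structure a bit slice.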

10. Two Data Path Chips: Counters

A four bit resettable up-counter chip was designed
by MacPitts using an algorithm given in the MacPitts
documentation. Figure 2.20 shows the algorithm to specify
the counter's behavior, and Figure 2.21 shows the resulting
chip layout diagram. This example gives an indication of the
implicative nature of MacPitts, which is actually a function
of the LISP object code. There is a bank of three vertical
drivers below the data path block in Figure 2.21. These are
clock drivers, which drive the three phase clock.
;MULTIAND.MAC
;SOURCE CODE FOR ALGORITHMIC CREATION OF LOGICAL
;<AND> FUNCTION BY MACPITTS SILICON COMPILER
(program multiand 4
  (def 1 ground)
  (def a port input (2 3 4 5))
  (def b port input (6 7 8 9))
  (def c port output (10 11 12 13))
  (def 14 phia)
  (def 15 phib)
  (def 16 phic)
  (def 17 power)
  (always
    (setq c (word-and a b))))

Figure 2.18 Multiand.mac








;Example of MACPITTS algorithm to create a 4 bit counter
;Illustrates use of "always" and "cond" commands
;title: count4.mac
(program count4 4
  (def 16 power)
  (def 1 ground)
  (def 2 phia)
  (def 3 phib)
  (def 4 phic)
  (def rst signal input 5)
  (def count register)
  (def cnt_up signal input 6)
  (def ld_zero signal input 7)
  (def out port output (12 13 14 15))
  (always
    (cond
      (ld_zero
        (setq count 0))
      (cnt_up
        (setq count (1+ count))))
    (setq out count)))

Figure 2.20 Count4.mac

The clock drivers connect to the clock lines on the bottom
and to the count registers at the top.

There is a small Weinberger array beneath the clock
drivers. A Weinberger array [Ref. 8] is used by MacPitts to
control data path operations. It can be inferred from the
size comparison between the data path block and the control
block that this is a data intensive chip. The MacPitts
algorithm reflects this, with many data operations such as
SETQ and (1+ count), the increment statement, and few control
operations such as

     (cond (<conditional> <actions> ...))

where each <conditional> requires a decision. This
decision making is perhaps more obvious in the generated
object file, where each COND statement is translated to an
IF statement. MacPitts implements the decisions more along
the lines of a Pascal CASE construction than as an IF
construction (the compiled LISP code reflects the IF logical
testing, but it is set within a parallelizing command).

Figure 2.21 Count4.cif

The SETQ form has operated on just ports so far. In
count4.mac, the SETQ form operates on a register (COUNT, the
current counter value). The last line in the algorithm,
(setq out count), sets the output port to the current count
register value. From the hardware perspective, this can be
viewed as a latching or storage of the register contents,
and clocking the contents to an output port. This is
necessary in MacPitts since ports cannot store data. Only
registers can store data in the data path, and MacPitts
implements registers as master-slave flip flops.

The chips considered so far, with the exception of
count4.mac, have been pure data path chips. In almost all
useful chips, there will be a data path which is controlled
by a Weinberger array control path. It is difficult to guess
the relative sizes of the data path and control path from
just the MacPitts driver algorithm. Nevertheless, if few
conditional decisions are to be made and many arithmetic or
logical operations are to be performed, the data path is
likely to be the larger.



Figure 2.22 shows the algorithm (the .mac file)
for count16ud.mac, the MacPitts driver for a 16 bit up/down
counter. The signal and register names are self explanatory.
The previous four bit up-counter was the prototype for this
16 bit up/down counter. The differences are in word length,
the addition of a new input signal (cnt_dn), the
conditional test of cnt_dn, and the decrement operation
(1- count) if cnt_dn is asserted true. It is usually a
good idea to model a desired algorithm with a simpler
prototype (functionally similar but having fewer inputs and
outputs), and to test the prototype in the MacPitts command
interpreter. For example, designing a four bit up counter is
a good preliminary step when a 16 bit up/down counter is
desired.

It can be inferred that the ratio of data path to
control path size will be greater for this chip than for
count4.mac. Figure 2.23 shows the resulting cifplot of
count16ud.mac. The 16 bit wide data path is indeed much
larger than the control path and, as expected, much larger
than the four bit counter data path also.
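
The guard behavior of the up/down counter can be prototyped in software before compiling a chip. The Python sketch below is a behavioral model only, not MacPitts code; the function name step is arbitrary, and the wrap-around at the word boundary (modulo 2 to the power of the word length) is an assumption rather than something established in this chapter. It evaluates the guards in the order ld_zero, cnt_up, cnt_dn and performs only the first one asserted.

```python
def step(count, ld_zero=False, cnt_up=False, cnt_dn=False, width=16):
    """Return the next value of the count register for one clock cycle.

    Guards are checked in order; the first asserted guard wins and the
    remaining guards are ignored. Wrap-around is an assumption.
    """
    mask = (1 << width) - 1
    if ld_zero:
        return 0
    if cnt_up:
        return (count + 1) & mask
    if cnt_dn:
        return (count - 1) & mask
    return count  # no guard asserted: the register holds its value


# Drive the model for a few cycles; 'out' simply mirrors the register.
count = 0
for signals in ({"cnt_up": True}, {"cnt_up": True}, {"cnt_dn": True},
                {"ld_zero": True, "cnt_up": True}):
    count = step(count, **signals)
    out = count
    print(signals, "->", out)
```

Calling step with width=4 gives a model of the four bit prototype, so the same function covers both counters.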



;Example of MACPITTS algorithm to create a 16 bit up/down counter
;copiously commented for clarity's sake
;title: count16ud.mac
(program count16ud 16
;note that the 16 opposite the title determines # of outputs
;doc. says data paths; actually equates to output pads (NOT paths)
;following 5 lines necessary every pgm:
  (def 1 ground)
  (def 2 phia)
  (def 3 phib)
  (def 4 phic)
  (def 25 power)
;the counter will require a 16 bit width storage register (McP = m/s FF),
;a count up enable signal,
;a count down enable signal,
;and a reset signal. These are described syntactically below:
  (def rst signal input 5)
;this declares a bank of 16 clocked m/s FFs (see stippleplot)
  (def count register)
  (def cnt_up signal input 6)
  (def cnt_dn signal input 7)
  (def ld_zero signal input 8)
;the 16 output pads are specified:
  (def out port output (9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24))
;always command means to execute what follows every clock cycle
  (always
;the cond(-ition) statement means to check the following guard
;conditions, and execute ONLY that one which is .true.
;execution of one guard precludes execution of any subsequent guards.
    (cond
;there are three guards to check: is ld_zero .true.?
;if not, is cnt_up .true.?
;if not, is cnt_dn .true.?
;if none is .true., then exit the loop
      (ld_zero
;if ld_zero is asserted (high), then make count=0 (i.e., clr FFs)
        (setq count 0))
      (cnt_up
;if cnt_up is asserted (high), then increment the count FF bank
        (setq count (1+ count)))
;if cnt_dn is asserted (high), then decrement the count FF bank
      (cnt_dn
        (setq count (1- count))))
;regardless of which (if any) operation is done, the FF contents
;are assigned to the output with the setq command.
    (setq out count)))

Figure 2.22 Count16ud.mac
Figure 2.23 Count16ud.cif

B. COMBINATIONAL LOGIC STRUCTURES IN THE CONTROL PATH



The implementation of combinational logic in the control
path of a MacPitts design is fundamentally different from
its implementation in the data path.

In the data path, all combinational logic is constructed
from basic two input NOR, NAND, and NOT cells, as described
in the MacPitts source code file data-path.lisp. Any
logical implementation, however complicated, is constructed
from these three organelles (other organelles do exist in
the organelles.l file, but they all are constituted either
from these basic cells or permutations of these cells).

Furthermore, the specifications required by MacPitts in
the data path are more oriented towards structure than
behavior. For instance, when the programmer/designer writes
the following algorithmic fragment

     (word-and a (word-and b c))

what is being explicitly specified is a two-level gate
structure. The innermost level comprises a two-input AND
gate, the output of which is fed to the input of the second
level AND gate, in parallel with the third input. Note that
a single gate with more than two inputs is not permitted in
the data path. The syntax constraints of the MacPitts
compiled object code determine this structure. Again, this
apparent limitation is not really a limitation at all,
because MacPitts is so constructed as to force decisions to
be made in the control path. Consequently, the necessity of
Boolean algebraic reduction in the data path combinational
logic is highly unlikely.

1. Control Path Combinational Logic

The control path implementation of combinational
logic is simpler than the data path implementation in two
ways. It is behavior oriented, rather than structure
oriented. The MacPitts designer needs only to specify the
MacPitts LISP-like behavior of the structure, and the
MacPitts environment produces a realization of it. This
requires little (if any) of the Boolean reduction which might
be required for complicated data path logical structures.

The control path combinational logic is also
simpler structurally, in that it is always implemented in a
highly regular Weinberger array. A tradeoff between
simplicity of layout and maximum circuit speed exists,
however, and this topic will be considered in Chapters IV
and V. Although a Weinberger array is geometrically simpler
than a Programmable Logic Array (PLA), it is not as fast or
as small.

The selection of which path is to perform the
combinational logic is inherent in the MacPitts (the
language) syntax. If the logical operator is a Boolean form
and its antecedents are signals or flags, the control path
will do the logic. If the logical operator is an integer
form and its antecedents are ports or registers, then the
combinational logic will take place in the data path. Thus,
the syntax drives the selection of where the combinational
logic occurs.

The initial MacPitts documentation offered some 
insight into these distinctions. A variety of tests were 
devised in the current investigation to explore the 
combinational logic implementation differences between the
data and control paths. The experiments designed to arrive 
at the above conclusions for the control path logic are 
presented in the following sections. 

2. A Control Path AND Gate, And Control Path Syntax

Casand.mac (cascaded AND gates, Figure 2.24) was
the algorithm to create the initial structure to explore
combinational logic implementation in the control path. The
control path implementation of combinational logic requires
a different kind of input declaration than does the data
path. In the control path, the inputs must be declared as

<name> signal input <pin number>

This has the effect of coercion to Boolean (true or false,
as opposed to one and zero) in the MacPitts environment.

Consequently, a different type of logical operator
is required in the SETQ argument forms. In the data path,
using defined-integer ports as inputs, the integer logic
SETQ forms are used (word-or, word-nand, etc.). In the
control path, however, Boolean SETQ forms are required (or,
nand, etc.). The data path integer SETQ forms are limited to
two logical arguments, whereas the control path SETQ forms
are effectively unlimited as to number of logical arguments.
This seemingly arbitrary constraint becomes understandable
in view of structural implementation in the respective
paths. In the data path, all logic must be implemented by
cascades of two input gates. In the control path, all logic
is implemented by a Weinberger array, which has no practical
limit (except speed, pin count, and chip size) on the number
of inputs.
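
The structural contrast can be made concrete with a small
sketch. The Python below is an illustration only: it assumes
the data path cascade is a simple left-to-right chain of two
input gates (the arrangement MacPitts actually chooses is not
specified here), and it models a control path gate as a single
NOR with any number of inputs. The names cascade_and and
wide_nor are hypothetical.

```python
from functools import reduce


def cascade_and(inputs):
    """n-input AND built as a chain of two-input ANDs.

    Returns (value, number_of_two_input_gates). A chain over n inputs
    needs n - 1 two-input gates.
    """
    inputs = list(inputs)
    value = reduce(lambda x, y: x and y, inputs)
    return value, len(inputs) - 1


def wide_nor(inputs):
    """One NOR gate with an arbitrary number of inputs."""
    return not any(inputs)


value, gates = cascade_and([True, True, True, True, True])
print(value, gates)              # True 4
print(wide_nor([False] * 16))    # True
```

Under this assumption a five input function costs four two
input gates in the data path, while the control path forms it
in a single wide gate.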

Furthermore, the data path combinational logic
restrictions are less strict (structurally speaking) than
are the control path logical structures. For instance, in
the data path all combinational logic structures are derived
from NAND, NOR, and NOT gates, and implemented as macro
organelles. In the control path, however, all logic
structures are constrained to be NOR gates. The basename.obj
file that results from a basename.mac file indicates all
control path combinational logic implemented as NOR
operations. Figure 2.25, casand.obj, shows the NOR function
used to perform the AND function in the control path. All
control path combinational logic operations are implemented
in this fashion, as in the more common PLA.



;CASAND.MAC
;SOURCE CODE FOR ALGORITHMIC CREATION OF LOGICAL
;<and> FUNCTION BY MACPITTS SILICON COMPILER//2 Input gate//
(program casand 1
  (def 1 ground)
  (def a signal input 5)
  (def b signal input 6)
  (def c signal output 7)
  (def 2 phia)
  (def 3 phib)
  (def 4 phic)
  (def 8 power)
  (always
    (cond (a
            (setq c (and a b)))
          (b
            (setq c (and a b))))))

Figure 2.24 Casand.mac



(((destination c)
  (source a)
  (source b)
  (logo casand)
  (word-length 1)
  (ground 1)
  (signal a input 5)
  (signal b input 6)
  (signal c output 7)
  (phia 2)
  (phib 3)
  (phic 4)
  (power 8))
 nil
 nil
 (((signal-output c) (nor ((primitive (gate 4)))))
  ((gate 4) (nor ((primitive (gate 3)) (primitive (gate 2)))))
  ((gate 3)
   (nor
    ((primitive (gate 1)) (primitive (gate 0)) (primitive (signal-input a)))))
  ((gate 2) (nor ((primitive (gate 1)) (primitive (gate 0)))))
  ((gate 1) (nor ((primitive (signal-input a)))))
  ((gate 0) (nor ((primitive (signal-input b))))))
 ((4 (phic))
  (3 (phib))
  (2 (phia))
  (1 (ground))
  (8 (power))
  (6 (input b (signal-input b)))
  (5 (input a (signal-input a)))
  (7 (outputs c (signal-output c)))))

Figure 2.25 Casand.obj

The AND plane in an NMOS PLA is actually comprised of NOR
gates; its function is logical AND, but its constituent
circuits are NOR gates. The NOR structure which the control
path uses is also different topologically from that used in
the data path.
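
That a NOR-only network realizes the AND function can be
checked by evaluating the gate list of Figure 2.25 directly.
The Python below is a sketch of such a check, not output of
MacPitts; the helper names nor and casand_output are
hypothetical, and the gate connections are taken from the
object file listing as transcribed here.

```python
from itertools import product


def nor(*inputs):
    """Logical NOR of any number of Boolean inputs."""
    return not any(inputs)


def casand_output(a, b):
    """Evaluate the gate list of Figure 2.25, lowest gate number first."""
    gate0 = nor(b)                    # inverted b
    gate1 = nor(a)                    # inverted a
    gate2 = nor(gate1, gate0)         # true only when a and b are true
    gate3 = nor(gate1, gate0, a)
    gate4 = nor(gate3, gate2)
    return nor(gate4)                 # (signal-output c)


for a, b in product([False, True], repeat=2):
    print(a, b, casand_output(a, b))
```

The printed table matches a two input AND. Under this reading
of the listing, gate 3 receives both a and the inverted a, so
it can never go true.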

A concise review of the data path and control path 
variable types illustrates the usage differences: 



                        data type

             BOOLEAN (true, false)       INTEGER (word valued)

STORAGE
ELEMENT      flag                        register

NON-
STORAGE
ELEMENT      signal (input, internal)    port (all types)



All storage elements are implemented as master- 
slave flip-flops. They retain their value until a new value 
is clocked into them. The flags are one bit wide, and are 
two-state devices, either true or false. The registers have
a capacity of the data path width as declared in the 
initial PROGRAM statement in the MacPitts source program 
written by the chip designer. 

Non-storage elements are used primarily for data
communication within a clock cycle, where clock cycle here
is taken to refer to the command interpreter clock cycle,
and not one of the three off-chip clock phases which a
MacPitts design requires. The determination of the value of
these non-storage elements is germane to pipelined digital
machines. When used in any application, care must be taken
so that their value is the one necessary for subsequent
stages of logic. A thorough understanding of the counter-
intuitive parallelism inherent in MacPitts (the language) is
necessary to avoid mistakes here. MacPitts is not like the
standard sequentially executed higher level languages. There
are at least three levels of implicit parallelism possible
in a MacPitts algorithm, and an understanding of parallel
operations is necessary to avoid functional errors. This
consideration is germane to MacPitts programming, and will
be considered in detail later in this Chapter and in
Chapters III and IV.

The next-to-last line in Figure 2.24 illustrates a
conditional. The (b ... statement is a checked <condition>
argument of the beginning COND (do upon condition)
statement, as is (a ... . If condition a is false, and
condition b is false, then no output is SETQ'd. Intuition
would suggest that the output would then either remain at
its last value or transition to tristate, neither of which
is correct. The output is pulled low by the Weinberger array
circuitry. This is evident in Figure 2.27, the Weinberger
array from casand.mac, and in Figure 2.28, the logic gate
equivalent. The (cond (t ... form can be used to set a
desired output, but it is usually better suited as a default
conditional,




Figure 2.26 Casand.cif

Figure 2.27 Casand.cif Weinberger Array

MacPitts does not view this algorithm as a usual
high level sequential test, however, but rather as a
parallel test of a and b. The non-intuitive parallelism of
MacPitts was mentioned in the previous paragraph, as was the
similarity of the MacPitts COND statement to the Pascal CASE
statement. Some elaboration will serve to clarify this
necessary concept. MacPitts evaluates all of the forms
within the scope of a COND statement in parallel, in a
mutually exclusive fashion. With regard to mutual
exclusivity, it is then similar to the CASE statement; each
condition under the scope of a COND can be modelled as a
flow-of-control switch, either turning on the evaluation of
its constituent forms or else skipping over their
evaluation. The analogy does not hold further than this,
however, because MacPitts evaluates all of the conditions
under a COND in parallel. The object code created from a
MacPitts source file illustrates this well. An example is

(cond
    (hot   (setq fan_on t))
    (cold  (setq fan_on f))
    (ok    (setq fan_on f)))



where hot, cold, and ok are Boolean variables
(signals or flags), and fan_on is in this case a Boolean
signal output which is to be turned on (t) or off (f)
depending on an input temperature signal. COND forces
parallel evaluation of the three conditions under its scope:
hot, cold, or ok. The last parenthesis in this fragment
closes the beginning parenthesis prior to the COND, bringing
the three conditions under its scope. Since these conditions
are evaluated in parallel, a better code fragment would be

(cond
    (hot   (setq fan_on t))
    (cold  (setq fan_on f))
    (t     (setq fan_on f)))



where the last line indicates TRUE, i.e., it is always true.
Since COND evaluates in parallel with mutual exclusion based
upon order, if either of the first two conditions is true,
then the remaining conditions are not evaluated. If neither
of the first two conditions is true, however, then the fan
will be turned off. This code fragment permits one less
signal input (or one less flag used) on the chip, and use of
the TRUE t condition should always be considered. Its use is
not necessary, as indicated by the first code fragment.
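
The ordering rule can be modelled in a few lines. The Python
below is a behavioral sketch only, under the assumptions
stated in this section: conditions have priority in the order
written, the condition t is always true, and a signal output
that no clause sets reads false. The function name eval_cond
and the data layout are hypothetical. The loop is sequential
for convenience; the hardware tests all conditions in the
same clock cycle.

```python
def eval_cond(clauses, inputs, outputs):
    """Model one clock cycle of a COND driving non-storage signals.

    clauses: ordered list of (condition_name, {output: value}); the
             condition name "t" is always true.
    inputs:  dict of Boolean input signals.
    outputs: names of the signal outputs; any output not set by the
             selected clause reads false.
    """
    result = {name: False for name in outputs}
    for condition, actions in clauses:
        if condition == "t" or inputs.get(condition, False):
            result.update(actions)    # actions of the first true clause only
            break
    return result


fan = [("hot", {"fan_on": True}),
       ("cold", {"fan_on": False}),
       ("t", {"fan_on": False})]

print(eval_cond(fan, {"hot": True, "cold": False}, ["fan_on"]))
print(eval_cond(fan, {"hot": False, "cold": False}, ["fan_on"]))
```

Dropping the final ("t", ...) clause leaves the result
unchanged in this model, because an unset signal already
reads false.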

MacPitts produces an accompanying object code which 
structurally resembles the following fragment 

(if
    (par (hot ...)
         (cold ...)
         (t ...)))

where the COND translates to an IF, and the parallelism of
MacPitts is evident in the PAR (parallelize) embracing the
three constituent conditions under the COND. Parentheses are
as important in MacPitts as they are in LISP. In the last
line above, there are three closing parentheses. The
innermost closes the TRUE condition, the middle parenthesis
closes the PAR (parallelization of condition checking), and
the outermost closes the IF (cond) statement.

The LISP object file of casand.mac in Figure 2.25
indicates the LISP equivalent of the MacPitts (language)
algorithm, and shows how LISP views the NOR gate inputs as
primitives. MacPitts is also able to compile a chip layout
directly from a LISP object code. This is an option for the
designer who is fluent in LISP in that customizing of the
code and hence the chip's structure is possible. RVLSI-3
[Ref. 6:p. 4] describes how to create a chip design from an
existing LISP object file.

Figure 2.26 shows the chip resulting from
casand.obj. The pads are all placed clockwise around the
periphery of the chip in the order specified in the .mac
file (Figure 2.24). This built-in function of MacPitts
lends itself to both errors and possibilities of
improvement. It is easy to identify pad function if the
MacPitts algorithmic source file (written by the designer)
is at hand.

Figure 2.27 also shows the topological difference
between the data path and the control path. In previous data
path circuits, all combinational logic was implemented with
recognizable NMOS logic gates. In the control path, the
Weinberger array is made up of many vertical metal columns
with perpendicular polysilicon lines cutting across them.
Figure 2.28 illustrates the structure more clearly.

Figure 2.28 Gate Equivalent of Casand Weinberger Array

In Figure 2.26 the Vdd input rail did not connect
with the main Vdd bus (it has been corrected in Figure
2.26). It passes through the polysilicon vias and stops
abruptly. The reason for this is the expectation of minimum
chip size which MacPitts harbors. For any but the simplest
of chips, the Vdd comb will extend out to the input Vdd
rail. If it does not, the Vdd pad can be placed almost at
will by modifying its position in the basename.mac file.
RVLSI-3 discusses this [Ref. 6:pp. 11-13]. The designer
can exercise a fair amount of latitude in pad placement, and
MacPitts will accommodate most of the time. The suggestion
in RVLSI-3 that GND be placed near the beginning and Vdd be
placed near the end is a good one. The main problem here
would arise if GND were placed on the right side of the chip
so that it contacted the Vdd comb (which it will do if care
is not exercised). MacPitts places pads exactly in the order
specified, and does no pad functional error-checking.
Similarly, if a pad is dual-defined, MacPitts permits it
with no diagnostics. This extends to the same pad being used
for both Vdd and an input signal. So care is important in
both pad specification and positioning.

There exists the possibility for some improvement
in chip speed by designer intervention in specifying the pad
location. By moving pads five, six, and seven (input and
output signals in casand.mac) closer to the Weinberger
array, the metal run lengths can be reduced and thus the
metal to substrate capacitance. This results in a somewhat
faster chip, all other factors being equal.

Figure 2.27 is a blowup of the Weinberger array
generated by casand.mac, and Figure 2.28 is its logic gate
equivalent. The Weinberger array is a versatile PLA-like
structure generally used to implement sequential logic. In
this chip, as an unclocked circuit, it implements a
combinational function. Weinberger array gate
instantiation errors were first detected here (circled).
Note the two half lambda gaps in the NOR gate inputs. By
Caesar editing, unexpanding of affected cells, and grep-
searching the .cif files it was discovered that these errors
occur whenever certain NOR gate inputs are invoked. The
errors themselves were suspected to reside in the
control.lisp file of the MacPitts source code. Two specific
cells appear to generate these errors: partial-gate-input
('ground-right), and partial-gate-input ('ground-left). Each
is one-half lambda too short. Chapter VI will treat the
solution of this problem. The MacPitts command interpreter
does not detect this type of error, since it only exercises
the algorithm. Lyra or a similar design rule checker will
detect this error. The designer would do well to visually
note MacPitts' inherent errors and correct them prior to
submission to a design rule checker (drc).

3. A Control Path OR Gate 

Figure 2.29 illustrates the MacPitts algorithm to
create a two input OR realization in the control path. The
OR function is realized by a selective SETQ choosing
process, in a similar fashion to the previous AND
realization.

Figure 2.30 is the Weinberger array logical unit of
casor.cif. The inputs are brought in on either side, and the
output comes out from the middle of the structure. The same
instantiation errors as in the previous chip were generated.
Partial-gate-input (gnd left) is depicted in the upper left
of the stipple plot, and partial-gate-input (gnd right) is
depicted in the lower right of the plot (circled) in Figure
2.30.

The logical operation of the Weinberger array could
stand some clarification. Figure 2.31 depicts a gate-level
functional representation of Figure 2.30, the control path



;CASOR.MAC
;SOURCE CODE FOR ALGORITHMIC CREATION OF LOGICAL
;<or> FUNCTION BY MACPITTS SILICON COMPILER//n Input gate//
(program casor 1
  (def 1 ground)
  (def a signal input 5)
  (def b signal input 6)
  (def c signal output 7)
  (def 2 phia)
  (def 3 phib)
  (def 4 phic)
  (def 8 power)
  (always
    (cond (a
            (setq c t))
          (b
            (setq c t)))))

Figure 2.29 Casor.mac



implementation of a two input COND-test OR structure. The
function will be explained with reference to Figures 2.30,
2.31, and 2.28. Figure 2.30 has four depletion mode
transistors (control columns to MacPitts). The leftmost
transistor is the first inverter in Figure 2.31. The next
column in Figure 2.30 serves as the top NOR gate in Figure
2.31. Moving right in Figure 2.30, the next column is the
output inverter, and the rightmost column is the lower NOR
gate in Figure 2.31. When viewed as a gate level equivalent,
it can be seen that the Weinberger array is both larger and
slower than its data path equivalent (cf. Figure 2.6). In
the control path, the signal requires approximately four
gate delays to propagate from input to output. This slowness
has been somewhat mitigated by

Figure 2.30 Casor Weinberger Array









Figure 2.31 Gate Equivalent of Casor Logic

the large aspect ratio of the pullup transistors (bottom,
Figure 2.30). The comparable logic gate in the data path
only requires approximately two gate delays, one for the NOR
gate and one for its subsequent inverter (Figure 2.7).

This simple COND-driven control path OR gate serves
as an indication of how MacPitts constructs similar yet more
complicated Weinberger array structures. The decision logic
is quite unlike that of a PLA. In a standard NMOS AND plane-
OR plane PLA, a signal may experience at most four gate
delays (considering input and output inverters both active,
and pass transistors inducing a very small time delay). For
this simple OR circuit, a delay of approximately four gate
delays is realized. The cascading of NOR gates and inverters
induces even more delay for more complicated Weinberger
array circuitry.

4. A Four Input OR Gate In The Control Path

A quad-input OR structure is specified
algorithmically in Figure 2.32. The OR logic which is
implicit in MacPitts specifications is perhaps clearer here
than in the two input OR structure. The COND statement
forces a Boolean test of each input, and selects the
appropriate output. To reiterate, the COND statement and its
attendant forms can be viewed as the if-then-else construct
of many higher level languages. The difference is that
MacPitts tests the condition forms in parallel, and not in a
serial fashion as most higher level software compilers
would. The mutual exclusiveness of the <conditions> is
determined by serial order, however, even though the testing
of the conditionals is done in one clock cycle (or in
parallel).
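
The claim that a chain of conditions, each setting the same
output true, behaves as an OR gate can be checked
exhaustively. The Python below is a sketch under the
assumptions already stated in this chapter (priority in
written order, and a signal output that reads false when no
clause sets it); the function name cond_or is hypothetical.

```python
from itertools import product


def cond_or(*inputs):
    """Priority test of each input; the first true one sets the output.

    Models (cond (a (setq e t)) (b (setq e t)) ...) for a signal
    output, which reads false when no clause sets it.
    """
    e = False                     # unset signal outputs read false
    for condition in inputs:
        if condition:
            e = True
            break
    return e


# Compare against a plain OR over all 16 input combinations.
mismatches = [bits for bits in product([False, True], repeat=4)
              if cond_or(*bits) != any(bits)]
print(mismatches)                 # [] : identical to a four input OR
```

Because every clause performs the same assignment, the
priority order has no effect on the result here; it matters
only when different clauses set different values.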

;QUADOR.MAC
;SOURCE CODE FOR ALGORITHMIC CREATION OF LOGICAL
;<or> FUNCTION BY MACPITTS SILICON COMPILER//4 Input gate//
(program quador 1
  (def 1 ground)
  (def a signal input 5)
  (def b signal input 6)
  (def c signal input 7)
  (def d signal input 8)
  (def e signal output 9)
  (def 2 phia)
  (def 3 phib)
  (def 4 phic)
  (def 10 power)
  (always
    (cond (a
            (setq e t))
          (b
            (setq e t))
          (c
            (setq e t))
          (d
            (setq e t)))))


Figure 2.32 Quador.mac

This is reflected in the resulting structures.
Figure 2.33 shows the labelled Weinberger array resulting
from quador.mac, and Figure 2.34 is its logic gate
equivalent. A strength of MacPitts is that it forces the
designer to consider both behavior and structure while in
the process of writing the driver algorithm. This is
considered to be advantageous, inasmuch as the abstractness
factor is minimal. There are two broad categories of
silicon compilers, behavior oriented (e.g., MacPitts), and
structure oriented (e.g., Bristle Blocks). In Bristle Blocks
and most other register transfer logic (RTL) silicon
compilers, a structure is the fundamental building block.
The structures (register, adder, ALU, gate) must be
connected appropriately to implement the desired behavior.
In MacPitts, the desired behavior of the chip is the input
to the silicon compiler and the chip which implements this
behavior is the output. The experienced designer is aware of
the structure that results from a given behavioral
specification, and has the latitude to optimize the
algorithm accordingly. This has been mentioned previously,
regarding pad placement and GND. Optimization will be
treated further later in this thesis.

5. A Four Input AND Gate In The Control Path

Figure 2.35 shows the algorithm to create a four
input AND gate in the control path, and Figure 2.36 shows
the Weinberger array from the logic block of quadand.cif.




Figure 2.33 Quador Weinberger Array 




Figure 2.34 Gate Equivalent of Quador Logic

Note the errors generated in this simple four input, one
output circuit (circled, Figure 2.36). There are seven gate
gap errors (all partial-gate-inputs), and three alignment
errors. The alignment errors are actually derived from mis-
translation of the Weinberger array interface cell by
MacPitts (the program). The interface cell is created with
the proper pitch, set aside in the VAX 11/780's memory, then
invoked and its image translated to the proper position in
the upper-left of the Weinberger array. By convention, upper
left on the MacPitts chips refers to the nominal position of
the GND pad, position one.



;QUADAND.MAC
;SOURCE CODE FOR ALGORITHMIC CREATION OF LOGICAL
;<and> FUNCTION BY MACPITTS SILICON COMPILER//4 Input gate//
(program quadand 1
  (def 1 ground)
  (def a signal input 5)
  (def b signal input 6)
  (def c signal input 7)
  (def d signal input 8)
  (def e signal output 9)
  (def 2 phia)
  (def 3 phib)
  (def 4 phic)
  (def 10 power)
  (always
    (cond (a
            (setq e (and a b c d)))
          (b
            (setq e (and a b c d)))
          (c
            (setq e (and a b c d)))
          (d
            (setq e (and a b c d)))
          (t
            (setq e f)))))

Figure 2.35 Quadand.mac


So what appears to be three separate alignment
errors is actually just one cell translation error. This
error should be repairable in the macro-instantiation
portion of MacPitts, although further investigation will
consider also the possibility of an error in program
installment.

6. A 16 Input OR Gate In The Control Path

It was stated previously that MacPitts will permit
no more than five deep cascading of the same gate organelle
in the data path. This is not the case, however, in the
control path. Figure 2.37 shows a MacPitts algorithm to
create a 16 input OR circuit. Note again how natural the
specification is, and the insight it gives into both
behavior and structure. To reiterate: in the data path, one
specifies structure explicitly and the implicit behavior
results. In the control path, one specifies behavior
explicitly, and the implied structure (always a Weinberger
array) results (cf. Figure 2.13, data path AND, and Figure
2.25, control path AND). The suggestion is to specify as
much combinational logic as possible in the control path
(this decision fortunately never arises because MacPitts is
not primarily a combinational logic design tool).

In program multior.mac the data path width is still
one. The data path width actually refers to the number of
outputs from the chip (in the absence of a data path), not
as its name would lead one to believe. So with one output,
the data path width is one, even though there are 16 inputs.




Figure 2.36 Quadand Weinberger Array

The format for data path width specification is

(program <program name> <data path width>

Figure 2.38 shows the chip structure of multior.cif. It is
seen that the chip is composed of a small un-clocked control
path unit alone, in the middle of the Weinberger Vdd/GND
comb. There are no data path organelles. As previous
experience would suggest, this control path has several
instantiation gap errors and cell translation errors (see
Figure 2.25). The large number of depletion pullup
transistors inherent to the Weinberger array is also
apparent. Combinational logic implementation in the control
path typically requires more depletion pullups than would be
required for the equivalent structure in the data path,
because all control path logic is done with NOR gates. Since
the pullups are always turned on, a MacPitts chip is not
expected to be very conservative of power. In the four input
OR gate, there were eight pullups in the Weinberger array,
and seven instantiation gap errors. In the 16 input OR
circuit, there are 30 pullup transistors, and approximately
40 gap errors. These errors are caused by instantiation of
the partial-gate-input cells (specifically, partial-gate-
input-ground-left and partial-gate-input-ground-right), and
they occur every time one of these cells is called.



;MULTIOR.MAC
;SOURCE CODE FOR ALGORITHMIC CREATION OF LOGICAL
;<or> FUNCTION BY MACPITTS SILICON COMPILER//16 Input gate//
(program multior 1
  (def 1 ground)
  (def 2 phia)(def 2 phib)(def 3 phic)
  (def a signal input 5)
  (def b signal input 6)
  (def c signal input 7)
  (def d signal input 8)
  (def e signal input 9)
  (def f signal input 10)
  (def g signal input 11)
  (def h signal input 12)
  (def i signal input 13)
  (def j signal input 14)
  (def k signal input 15)
  (def l signal input 16)
  (def m signal input 17)
  (def n signal input 18)
  (def o signal input 19)
  (def p signal input 20)
  (def q signal output 21)
  (def 22 power)
  (always
    (cond (a (setq q t))
          (b (setq q t))
          (c (setq q t))
          (d (setq q t))
          (e (setq q t))
          (f (setq q t))
          (g (setq q t))
          (h (setq q t))
          (i (setq q t))
          (j (setq q t))
          (k (setq q t))
          (l (setq q t))
          (m (setq q t))
          (n (setq q t))
          (o (setq q t))
          (p (setq q t)))))

Figure 2.37 Multior.mac

Figure 2.38 Multior.cif

MacPitts is limited in the data path as to how many



combinational logic cascades may be made. Since the control
path is designed to make decisions, the combinational logic
cascading constraint is absent for most practical chips.
Nevertheless, an error was detected in the multior.cif file,
Figure 2.38. From multior.mac in Figure 2.37, one would
expect the chip to have 22 pads: 16 input pads, one output
pad, three clock pads, one ground pad, and one Vdd pad. The
cifplot only shows 21 pads. This error does not show in the
command interpreter. The 16 input OR function works as
expected there. The error apparently lies elsewhere than in
the .mac file. The chip does function nevertheless, but as a
15 input OR gate instead of as a 16 input OR gate. The pad
deletion error (one fewer pads instantiated than specified
in the .mac file) occurs whenever an OR gate having more
than five inputs is specified in the .mac file. This is an
unexpected error, though not very serious. The control path
is rarely called on to do this sort of logic. If a special
function of this type is required of a MacPitts chip, the
designer can circumvent this problem by specifying an extra
input pad in the .mac file. The chip will compile to cif,
but the extra pad will not be instantiated nor will any of
the attendant combinational logic or wires.



7. Control Path Semantics

The syntax (algorithm rules) for combinational
logic in the control path has been illustrated in the
previous sections. To gain an understanding of MacPitts, the
semantics (what the algorithm means) is more important than
how to say what it means.

The parallelism possible in MacPitts has been
previously referred to in the discussion of parallel testing
of conditions under a COND statement. This is not the only
place where MacPitts forces parallelism. Parallelism is also
forced upon all <actions> within a true condition under a
COND. The general form of a COND statement is

(cond ( <condition> <actions> <transition> ))

The <condition> is a Boolean variable upon which the
true/false test is made, the <actions> are SETQs, and the
<transition> is one of GO, CALL, or RETURN (to be discussed
in Chapter IV). In the previous example, both hot and cold
were Boolean conditional variables which would be tested in
parallel. The <actions> under the COND refer to a set of
SETQ assignment operators, and the SETQs under a COND are
all done in parallel, or simultaneously. The <transition>
form indicates a state transition to be made if <condition>
is evaluated as true. This state transition occurs in
parallel (same clock cycle) with the SETQs of the associated
<actions>. The state transition mechanism of MacPitts is
very straightforward and natural to a designer familiar with
Mealy type finite state machines. This topic will be
considered in depth in Chapters IV and V.

Note the difference between the parallelism implied
within the COND and that parallelism implied in condition
evaluation. The conditions are all examined in parallel, and
for the first one that evaluates to logical TRUE, all forms
within its scope are executed in parallel. This high degree
of implicit parallelism makes MacPitts ideally suited for
pipelined architectures. Consider the following code in
which three Boolean conditionals determine the outputs. The
destinations of the SETQs are also Boolean, and in this case
are non-storage elements (signals). The outputs are declared
signals instead of flags (which are storage devices) so that
when they are not set within a clock cycle they will
transition to false.



(cond
    (hot
        (setq fan_on t)
        (setq windows_open t)
        (setq doors_open t)
        (setq heater_on f))
    (cold
        (setq fan_on f)
        (setq windows_open f)
        (setq doors_open f)
        (setq heater_on t))
    (t
        (setq windows_open t)
        (setq doors_open t))
)



This algorithm models a simple digital home temperature
controller where f refers to an inactive or closed device,
t refers to an active or open device, and a comfortable
temperature deadband exists between heating and cooling
requirements. All three Boolean conditions (hot, cold, and
true) are tested in parallel. The order of mutual exclusion
is the order in which the conditions are written (if both
cold and t are true simultaneously, only the actions under
cold will be executed). The conditional (t ...) is the
MacPitts equivalent of a reserved word, and indicates the
always true conditional. It is used in this algorithm as
the default state of the system, where the temperature is
comfortable enough to leave both the doors and windows open.
Even though (t ...) is always true, the evaluation order of
the conditionals prevents the forms under its scope from
being set unless both the preceding conditionals are false.
The actions under each true condition are also performed in
parallel, or in the same clock cycle. So the testing of all
three conditions and the resultant SETQ <actions> occur in
only one clock cycle, due to the implicit parallelism of
MacPitts. It is not necessary for the MacPitts programmer to
explicitly parallelize the forms under a COND; the MacPitts
compiler does this every time it encounters a COND. The
(setq <output> f) statements under the hot and cold CONDs
are not required for this system. As explained previously,
the Weinberger array will set the output false if it is not
explicitly driven true for non-storage Boolean variables.
The (setq <output> f) statements have the advantage of added
clarity in the MacPitts driver algorithm at the expense of
increased size of the Weinberger array (more decisions are
required).
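
The priority behavior described above can be sketched in software. The following Python fragment is only an illustrative model of one clock cycle, not MacPitts code; the function name cond_controller and the dictionary of outputs are assumptions made for the sketch. It assumes, as the text states, that signals not driven true in a cycle read false, which is why the explicit (setq <output> f) forms are omitted here.

```python
def cond_controller(hot: bool, cold: bool) -> dict:
    """Model one clock cycle of the three-clause COND (illustrative only)."""
    # Signals are non-storage: every output starts the cycle false
    # and stays false unless a clause drives it true.
    out = {"fan_on": False, "heater_on": False,
           "windows_open": False, "doors_open": False}
    if hot:                      # first written clause has priority
        out["fan_on"] = True
        out["windows_open"] = True
        out["doors_open"] = True
    elif cold:                   # reached only when hot is false
        out["heater_on"] = True
    else:                        # the always-true (t ...) default clause
        out["windows_open"] = True
        out["doors_open"] = True
    return out
```

All assignments inside the selected clause are treated as taking effect together, which corresponds to the single clock cycle described in the text.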

The following code fragment produces the same 
results, though is somewhat more obscure: 

(always
   (par
      (setq fan_on hot)
      (setq heater_on cold)
      (setq windows_open (not cold))
      (setq doors_open (not cold))))



In this example, no conditional testing is necessary
although the results are equivalent to the previous example.
On every clock cycle, all of the forms embraced by PAR are
executed. On each clock cycle, the fan, heater, windows, and
doors are set to the correct state. The resulting hardware
is simpler, since fewer decisions are required. This is the
preferred format when conditional testing can be explicitly
done with Boolean logic in the Weinberger array. But this
code fragment lacks the ability to branch. When transfer of
control is required, then it is necessary to use the full
generalized COND statement

cond (<conditional> <actions> <transition>)

form instead of the truncated version

cond (<conditional> <actions>)
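
The equivalence of the COND and PAR versions of the temperature controller can be checked by enumeration. The Python sketch below is a hypothetical software model (the function names and the tuple ordering are assumptions, not part of MacPitts). Under this model the two versions agree for every input the deadband allows; they differ only if hot and cold are asserted together, a case the controller description rules out.

```python
from itertools import product

def cond_version(hot: bool, cold: bool) -> tuple:
    """Outputs as (fan_on, heater_on, windows_open, doors_open)."""
    if hot:                                  # first clause wins
        return (True, False, True, True)
    if cold:
        return (False, True, False, False)
    return (False, False, True, True)        # (t ...) default

def par_version(hot: bool, cold: bool) -> tuple:
    """Unconditional assignments, as in the PAR fragment."""
    return (hot, cold, not cold, not cold)

def disagreements() -> list:
    """Input pairs (hot, cold) on which the two models differ."""
    return [(h, c) for h, c in product([False, True], repeat=2)
            if cond_version(h, c) != par_version(h, c)]
```

Calling disagreements() lists the input combinations a designer would need to exclude (or handle) before replacing the COND form with the PAR form.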
Five Input AND Gates In The Control Path

The savings of area in the Weinberger array can be
substantial when Boolean decisions are made without a
precedent COND statement. Figure 2.39 shows the MacPitts
code used to generate a five input AND gate using COND for
each output, and Figure 2.40 shows the resulting Weinberger
array. Figure 2.41 is the logic gate equivalent of the five
input COND driven AND gate. Contrast this with Figure 2.42,
illustrating the code for generation of a five input AND
gate in the Weinberger array without CONDs, Figure 2.43, the
resulting Weinberger array logic generated by MacPitts, and
Figure 2.44, the logic gate equivalent of a five input AND
gate without CONDs.

;FIVEAND.MAC
;SOURCE CODE FOR ALGORITHMIC CREATION OF LOGICAL
;<and> FUNCTION BY MACPITTS SILICON COMPILER//5 Input gate//
(program fiveand 1
   (def 1 ground)
   (def a signal input 5)
   (def b signal input 6)
   (def c signal input 7)
   (def d signal input 8)
   (def e signal input 9)
   (def z signal output 10)
   (def 2 phia)
   (def 3 phib)
   (def 4 phic)
   (def 11 power)
   (always
      (cond (a (setq z (and a b c d e)))
            (b (setq z (and a b c d e)))
            (c (setq z (and a b c d e)))
            (d (setq z (and a b c d e)))
            (e (setq z (and a b c d e)))
            (t (setq z f))))
)

Figure 2.39 Fiveand.mac

Figure 2.40 Weinberger Array from Fiveand.cif

Figure 2.41 Gate Equivalent of Fiveand Logic



The second structure is far simpler topologically, having
only six pullup transistors. The Weinberger array which
achieves the same results with CONDs, Figure 2.39, requires
twelve pullups by comparison. Since fewer explicit decisions
need to be specified, even the code of the COND-less chip is
more terse than its COND decision counterpart. In comparing
the logic gate circuit equivalents, the five input AND gate
created with CONDs requires six inverters and six NOR gates,
and the NOR gates have fan-ins of five, six, seven, eight,
and nine. There are four levels to this structure. The five
input AND gate created without CONDs has only five inverters
and one NOR gate with a fan-in of five, and there are two
levels of gates. The circuit created without CONDs is
smaller, simpler, and faster.
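
The two-level COND-less structure (five inverters feeding one five-input NOR) can be verified against the AND function by exhaustive enumeration. The Python sketch below is a logic-level model only; the helper names nor and and5_condless are assumptions for illustration. Counting one depletion pullup per NMOS gate, the six gates modelled here agree with the six pullups quoted above.

```python
from itertools import product

def nor(*inputs: bool) -> bool:
    """NOR gate; with a single input it acts as an inverter."""
    return not any(inputs)

def and5_condless(a: bool, b: bool, c: bool, d: bool, e: bool) -> bool:
    # Level 1: five inverters (degenerate single-input NORs).
    inverted = [nor(x) for x in (a, b, c, d, e)]
    # Level 2: one five-input NOR of the inverted inputs.
    # By De Morgan, NOR(a', b', c', d', e') = a AND b AND c AND d AND e.
    return nor(*inverted)

def matches_and_everywhere() -> bool:
    """True if the NOR structure equals AND on all 32 input combinations."""
    return all(and5_condless(*bits) == all(bits)
               for bits in product([False, True], repeat=5))
```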

;SIMPL5AND.MAC
;SOURCE CODE FOR ALGORITHMIC CREATION OF LOGICAL
;<and> FUNCTION BY MACPITTS SILICON COMPILER//5 Input gate//
(program simpl5and 1
   (def 1 ground)
   (def a signal input 5)
   (def b signal input 6)
   (def c signal input 7)
   (def d signal input 8)
   (def e signal input 9)
   (def z signal output 10)
   (def 2 phia)
   (def 3 phib)
   (def 4 phic)
   (def 11 power)
   (always
      (setq z (and a b c d e))))

Figure 2.42 Simpl5and.mac

Figure 2.43 Weinberger Array from Simpl5and.cif

Figure 2.44 Gate Equivalent of Simpl5and Logic
The economics of using COND-less algorithms does not always
justify their use. Silicon compilation is intended to free
the engineer from the micro-design aspects of creating a
chip, and Boolean minimization (see the home temperature
controller example) is a step away from this goal.
Typically, the control path is not used to implement
combinational logic functions, but rather to provide
controlling inputs to data path operations. The decision to
signal on five simultaneous TRUE inputs would always be done
as shown in Figure 2.42, and not as in Figure 2.39, but this
decision would usually have a COND embracing (around)
itself. The COND in MacPitts is used for decision. Attempts
to minimize CONDs will lead to a loss of clarity in the
algorithm (see the simplified home temperature controller
example). Nevertheless, if the Weinberger array becomes too
large and slow, Boolean reduction techniques such as
Quine-McCluskey or Karnaugh maps should be considered.
9. A Better 16 Input Control Path OR Gate

A remarkable power savings in the Weinberger array
can be expected where this alternate algorithm (explicit
specification of outputs without use of COND testing) is
feasible. Figure 2.45 depicts another method of
algorithmically specifying a sixteen input logical OR
selector in the control path (compare with Figure 2.37).
Figure 2.38 shows the resulting layout from the algorithm
using multiple CONDs for selection, and Figure 2.46 shows
the Weinberger array layout resulting from the algorithm
using just Boolean logic specification. Figure 2.47 shows
the logic gate equivalent of Figure 2.46.



;SMPLMLTR.MAC
;SOURCE CODE FOR ALGORITHMIC CREATION OF LOGICAL
;<or> FUNCTION BY MACPITTS SILICON COMPILER//16 Input gate//
;a simplified structure resulting from elimination of "cond"
(program smplmltr 1
   (def 1 ground)
   (def a signal input 5)
   (def b signal input 6)
   (def c signal input 7)
   (def d signal input 8)
   (def e signal input 9)
   (def f signal input 10)
   (def g signal input 11)
   (def h signal input 12)
   (def i signal input 13)
   (def j signal input 14)
   (def k signal input 15)
   (def l signal input 16)
   (def m signal input 17)
   (def n signal input 18)
   (def o signal input 19)
   (def p signal input 20)
   (def q signal output 21)
   (def 2 phia)
   (def 3 phib)
   (def 4 phic)
   (def 22 power)
   (always
      (setq q (or a b c d e f g h i j k l m n o p))
   ))

Figure 2.45 Smplmltr.mac




Figure 2.46 Weinberger Array from Smplmltr.cif

Note in particular the difference in number of pullup
transistors between the two circuits (Figures 2.38 and
2.46). There are thirty pullups in the circuit created using
COND testing, and only two pullups in the circuit created
from the COND-less algorithm. The pullup transistors are
always turned on, and as a consequence consume
proportionally more power than transistors which are
intermittently turned on. So a circuit power consumption
savings can be realized by COND-less decision specification,
where appropriate. But note that this is not always
possible, nor is the COND-less algorithm always as clearly
understood as the algorithm using COND for testing and
branching.
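
The two-pullup figure is consistent with a structure of one sixteen-input NOR followed by one output inverter, assuming one depletion pullup per NMOS gate. The Python sketch below models that structure at the logic level; the names nor and or16 are hypothetical, and the gate arrangement is an assumption inferred from the pullup count rather than read from the layout.

```python
def nor(*inputs: bool) -> bool:
    """NOR gate; with a single input it acts as an inverter."""
    return not any(inputs)

def or16(inputs) -> bool:
    """Sixteen-input OR built from one NOR gate and one inverter."""
    values = list(inputs)
    if len(values) != 16:
        raise ValueError("exactly sixteen inputs are expected")
    # Gate 1: sixteen-input NOR.  Gate 2: inverter on its output.
    return nor(nor(*values))
```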

These logic decisions would all occur electrically
in the Weinberger array (equivalently, occurring
algorithmically in the compiled LISP object code), since the
decision stipulations are Boolean and not integer. The forms
for Boolean combinational logic and integer (word)
combinational logic are syntactically different, and it is
necessary that the MacPitts programmer understand this
syntax difference in addition to the logical implementation
difference described previously.

10. Two Considerations In MacPitts Programming

MacPitts is both a programming language and a
method of designing digital circuits. As such, the
programmer must consider the consequences of syntax used in
the driver algorithm (the .mac file). It is not always
apparent beforehand whether a given function should be done
in the control path or in the data path. The choice is
determined by the syntax used by the designer.

Suppose a four input AND gate is to be designed in
both the data path (word type) and in the control path
(Boolean type), where a, b, c, and d are inputs and z is the
output. The statement which relegates the decision to the
data path is

(setq z (word-and a (word-and b (word-and c d))))

where a, b, c, d, and z must all be either ports or
registers (integer valued). The corresponding statement for
the control path is

(setq z (and a b c d))

which requires that a, b, c, d, and z all be either signals
or flags (Boolean valued).
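
The structural difference between the two statements can be illustrated with a small software analogy. In the Python sketch below, word_and stands in for the two-input data path organelle and is applied in the same right-nested order as the word-and statement, while control_path_and stands in for the single n-input Boolean form. All names are hypothetical; the sketch models only the logic, not the layout MacPitts generates.

```python
from functools import reduce

def word_and(x: int, y: int) -> int:
    """Two-input bitwise AND on integer words (organelle stand-in)."""
    return x & y

def data_path_and(words):
    """Right-nested chain: (word-and a (word-and b (word-and c d)))."""
    *rest, last = words
    return reduce(lambda acc, w: word_and(w, acc), reversed(rest), last)

def organelles_needed(n_inputs: int) -> int:
    """A chain of two-input ANDs over n inputs uses n - 1 of them."""
    return n_inputs - 1

def control_path_and(*signals: bool) -> bool:
    """Single n-input Boolean AND, as in (and a b c d)."""
    return all(signals)
```

For four inputs the chain uses three two-input operations, and for five inputs it uses four.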

In complicated architectures and most sequential
machines, this choice does not have to be made a priori, but
rather will be made by syntax in writing the MacPitts
algorithm. In simpler architectures, like a Hamming error
detector or a Gray code decoder, this decision should be
made beforehand. The choice can be regarded as one between
individual treatment of the data bits (usually done in the
control path logic), or treating the data as n-bit words
(done exclusively in the data path). Examples of algorithms
to do Gray code decoding and Hamming error detection and
correction are given in Chapters IV and VI.

The MacPitts programmer/designer must also consider
the hardware ramifications of syntax. The algorithm chosen
to implement a function in MacPitts drives the circuit
implementation to achieve that function.



It has been mentioned previously that COND forces
conditionals to be tested in parallel, and their associated
actions to be SETQ'd in parallel. This equates to a silicon
area/speed tradeoff on the chip. If multiple operations of
the same type are to be done under a COND, MacPitts will
instantiate copies of the required organelle, and perform
the operations in parallel. Conversely, if the same
operations are not put under a COND, MacPitts will
instantiate only one copy of the organelle, and perform the
operations serially. For instance, there are two ways to
perform a set of three data path logical two-input ANDs on
six inputs. The first method does the operations in
parallel, at the cost of silicon area.

(cond (t
   (setq x (word-and a b))
   (setq y (word-and c d))
   (setq z (word-and e f))))

This algorithm fragment would execute in one clock cycle,
but MacPitts would implement it with three data path AND
gate organelles, each gate having two inputs. The slower
algorithm would be

(setq x (word-and a b))
(setq y (word-and c d))
(setq z (word-and e f))

The second example would require three clock cycles to
execute, but only one data path AND organelle would be
instantiated. Similarly, PAR forces all forms within its
scope to be executed in parallel. The best way to verify
this is to create a short FSM algorithm, and manually clock
it while in the interpreter. (This is also an excellent
method to optimize algorithms for throughput by paralleling
operations where possible and testing for execution in the
interpreter. The results may not be what is expected.)
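
The area/speed tradeoff just described can be summarized in a toy cost model. The Python sketch below is an assumption-laden illustration: the function name schedule and its return fields are hypothetical, and it encodes only the rule stated in the text (parallel forms take one clock cycle and one organelle per operation; serial forms share one organelle and take one cycle per operation).

```python
def schedule(n_ops: int, parallel: bool) -> dict:
    """Clock cycles and organelle copies for n_ops same-type operations."""
    if n_ops < 1:
        raise ValueError("at least one operation is required")
    if parallel:
        # Forms under COND or PAR: all operations in one cycle,
        # one organelle instantiated per operation.
        return {"clock_cycles": 1, "organelles": n_ops}
    # Sequential forms: one shared organelle, one cycle per operation.
    return {"clock_cycles": n_ops, "organelles": 1}
```

For the three word-and operations above, the model gives one cycle and three organelles in the parallel case, and three cycles and one organelle in the serial case.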

C. SUMMARY 

This chapter discussed the differences between MacPitts'
implementation of combinational logic in the control path
and data path. The fundamental difference is one of
structure, which is driven by syntax.

When the data type is defined Boolean, and the correct
operations are applied to the bits, the combinational logic
occurs in the control path. Control path logic is always
done by a Weinberger array, an array of NOR gates. When the
data type is defined as integer, and the correct operations
are applied to the words, the combinational logic occurs in
the data path. The fundamental units of the data path are
two-input organelles, which are structural mappings of the
syntactical statements NOT, AND, NAND, OR, NOR, XOR,
increment/decrement, and add/subtract. The data path
performs the arithmetic functions and also generates signals
to control for decisions. Combinational logic syntax (and
hence structure) in the data path obeys the fundamental laws
of Boolean algebra, such as associativity and commutativity.
The designer must consider these laws in writing the
MacPitts algorithm if correct function is desired.

The LISP-like COND form produces parallelism in
MacPitts. The COND form is a statement which (structurally)
implements decisions in the Weinberger array and
(algorithmically) drives control flow in both the .mac file
and the .obj file. Control path structures may be reduced in
size (where possible) by not using the COND form to specify
output conditional setting. The alternative is the PAR
(parallelize) form, which parallels all the forms under its
scope. The forms embraced by PAR must be the functional
equivalents of those under COND, which requires designer
intervention and possibly Boolean algebraic reduction. The
result of this alternative is unconditional explicit
assignment of outputs. This is feasible in simpler chips,
and should always be considered on the basis of an
engineering tradeoff between design time and chip speed.

The COND statement, with multiple selections of
conditionals, can be viewed as an implicit AND-OR structure
realized in NORs in the Weinberger array. An alternate
syntactical viewpoint of COND is the CASE statement.

The gates created in this chapter are rather artificial,
in that they were made to show just the structures desired.
In practice, the combinational logic structures used are
likely to differ slightly.
III. A SPEED-POWER COMPARISON BETWEEN A DATA PATH
AND CONTROL PATH EQUIVALENT CIRCUIT

A behavior-oriented silicon compiler requires a high
level algorithmic description of the chip's desired function
as its input. The output is a machine readable low level
geometric description of the resulting digital circuit,
usually CIF (Caltech Intermediate Form), a language
describing rectangles from which the various process masks
and their relative locations are registered. When a CIF file
is processed by MOSIS (Metal Oxide Semiconductor
Implementation Service), the desired chip results.

Chapter II considered the qualitative effects of
algorithmic syntax on some circuit structures in the data
and control paths. It is also desired to do a quantitative
investigation on functionally equivalent circuits in each
path, and to compare the results. The circuits chosen are
the five input AND gates in both their control path and
data path configurations. Handcrafted versions of the five
input AND gate are contrasted to the MacPitts five input AND
gates.

A. DATA PATH FIVE INPUT AND GATE 

Figure 3.1 shows the algorithm used to create a five
input AND gate in the data path. Figure 3.2 shows the
labelled cifplot of the four cascaded NAND organelles and
four inverters, and Figure 3.3 is the logic gate equivalent
of the cifplot. The LISP object file is included in Appendix
A to show how MacPitts implements the data path AND function
by invoking the organelle AND four times.

(setq z
   (word-and a (word-and b (word-and c (word-and d e))))))

Figure 3.1 Data Path Five Input AND Gate .mac File

Figure 3.2 Stipple Plot of Data Path Five Input AND Gate

As discussed in Chapter II, the MacPitts algorithm produces
the LISP object file, from which MacPitts (the silicon
compiler) produces the layout. At run time, the MacPitts
(silicon compiler) script file shown in Appendix A is
created. The best way to create a script file of a MacPitts
terminal session is to issue the command

macpitts basename herald > basename.script &

where the option herald directs MacPitts to send compiler
messages (see compmesg.* files in MacPitts source code) to
the designated output device, ">" is the BSD Unix redirect,
basename.script is the file into which the terminal session
is to be recorded, and "&" is the Unix command to put a
process into the background. If the algorithm is not fully
debugged, then issue instead

macpitts basename herald 

so MacPitts diagnostics and Liszt diagnostics both will come
to the screen, and no hardcopy recording will occur. It is
possible to both monitor and simultaneously record the
MacPitts compilation, by issuing the command

script basename.script (starts script recording)

to which Unix will respond with

"Script started, filename is basename.script"

then issue the full path command (a Unix bug requires this)

/vlsi/macpit/bin/macpitts basename herald

and when compilation is done type control d to terminate the
script recording. The script capability is useful for
following the MacPitts compilation process, gives insight
into how MacPitts works, and assists in debugging the driver
algorithm. Tracing of MacPitts' compilation of an algorithm
can then be done with a grep search on the compmesg.* files
for the statistics and the *.lisp files for the herald
messages. If the algorithm halts execution, the script file
indicates where in the compilation process the error was
detected. That part of the algorithm can then be checked for
errors.

Figure 3.3 Gate Equivalent of Figure 3.2

Figure 3.4 Stipple Plot Showing Critical Nodes

The script of a MacPitts session also has informative
material (statistics) on the chip size, components, maximum
power used, and host computer effort expended to compile the
chip. Carlson [Ref. 2:p. 43] describes the script file
produced by a MacPitts compilation session.

After the basename.cif file is produced by MacPitts, it
is necessary to comment out the beginning user extension
zero lines with the vi screen editor. This is done by
invoking vi on the cif file

vi basename.cif

and placing parentheses around these lines. Carlson
[Ref. 2:p. 70] explains why this is necessary. The Caesar
file must next be created so labelling of nodes can be done
for Mextra (Manhattan Circuit Extractor). The command to
convert a .cif file to a .ca file is

cif2ca -o <offset> basename.cif

where the offset is a number added to the Caesar symbol
xx.ca files to distinguish them from previously created
symbol files which might have the same number (xx).

The procedure described above results in a MacPitts end
product, the basename.cif file, and a version of that file
amenable to editing in the VLSI graphics editor Caesar, the
basename.ca file. For quantitative analysis of a MacPitts
design, further steps are required.

To begin this analysis, the nodes are labelled (in
Caesar) for Mextra and Crystal (a timing analyzer). Work by
Froede [Ref. 3:pp. 63-80] addresses Crystal analysis of
MacPitts circuits. After the input, output, GND, and Vdd
nodes are labelled, the following commands are issued

:save
and then,
:cif -p

in Caesar to save the new labelled .ca file and to create a
.cif file with nodes at points (-p) for Mextra. Figure 3.2
is the point-labelled cifplot of the data path five input
AND gate. Next Mextra is invoked on the labelled file by the
command

mextra -o basename

where the -o switch causes more accurate capacitance
calculation (than is done without -o). Mextra produces the
basename.nodes file, which can be checked for connectivity
and to see that all labelled nodes are included. Appendix A
shows the .nodes file for the data path AND gate. The
basename.sim file is also produced, and can be used for
switch level simulation with Esim, SPICE simulation, Crystal
timing analysis, and power estimation with Powest. The
berk85 version of Crystal is more useful than the berk83
version. To record a Crystal session, start the script
recording, and then call Crystal with its full path
designator

/vlsi/berk85/bin/crystal basename.sim

Crystal has many options and commands. The 1985 version of
the Crystal manual which describes them is available on the
Naval Postgraduate School VAX 11/780 in the file

/vlsi/berk85/doc/crystal/crystal.tblms

Appendix A shows the script recording of a Crystal analysis
of the data path AND gate. After the input and output nodes
are assigned and the delay is given, the command

critical -g filename.dummy

is issued, then Crystal is stopped with

quit

and then script is terminated with control d. The critical
command determines the time-critical (i.e., slowest) signal
path, and the -g (graphical results) switch in conjunction
with it creates a Caesar-compatible file of the critical
node locations as shown in Appendix A. This file can then be
added to the basename.ca file by the sequence of commands

caesar basename (Caesar edit labelled file)
:source filename (add critical nodes to screen)

Since the Crystal nodes displayed in Caesar are not
reproduced in cif, the nodes must be edited in Caesar if
an annotated stipple plot is desired. One technique is to
erase the Crystal-sourced (created by the :source command)
nodes, and replace them with implant layer squares (implant
for visibility and contrast) and then to relabel the delay
times with Caesar's :label command. The revised Caesar file
can then be saved and converted to cif for stipple plotting
with the series of commands

:save
and then
:cif -p

Figure 3.4 shows the cifplot of the circuit with the
critical nodes marked. The critical nodes lie along what
Crystal considers the critical (slowest) path. The largest
delay shown is the circuit cumulative delay, and each marked
node indicates a cumulative delay. This makes it simple to
determine the delay between critical nodes as the difference
between their successive cumulative delays. The stipple plot
can be difficult to interpret if it is desired to determine
what structure causes the delays. A gate equivalent of the
cifplot can be helpful in the analysis. The gate level
equivalent of this circuit with marked cumulative delays is
shown in Figure 3.5. The data path AND gate spreads the
delay out evenly, with approximately 10 ns per gate, as is
expected from the transistor aspect ratios shown in Figure 

r-, 
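
The conversion from cumulative delays at the marked nodes to per-stage delays is a simple successive difference. The Python sketch below shows the calculation; the function name stage_delays is hypothetical, and any cumulative values passed to it should be read from the Crystal output, since none are assumed here.

```python
def stage_delays(cumulative_ns):
    """Delay contributed by each stage, given cumulative node delays in ns."""
    previous = 0.0
    stages = []
    for t in cumulative_ns:
        # Each stage delay is the difference between successive
        # cumulative delays along the critical path.
        stages.append(round(t - previous, 2))
        previous = t
    return stages
```

For example, purely illustrative cumulative delays of 10.0, 20.5, 30.0, and 41.0 ns would correspond to stage delays of 10.0, 10.5, 9.5, and 11.0 ns.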

The maximum power consumed by the circuit can be
determined in either of two ways. The MacPitts script
session (of the compilation process) records it, or Powest
(Power ESTimator) can be used on the basename.sim file
produced by Mextra. Powest computes the power based on only
the number of depletion transistors, assuming that they are
on all the time (for the maximum power figure) or on half
the time (for the average power figure). MacPitts considers
both the number of depletion transistors and the power
consumed by the circuit wires, so the MacPitts power should
be the more accurate of the two. The command to use Powest
on the .sim file is

powest -p < basename.sim

where the -p switch directs Powest to print out informative
data about the circuit, and the < is the Unix input
redirect, which directs the .sim file to Powest. Appendix A
shows the result of a Powest analysis of the five input data
path AND gate. Checking the Powest result can also serve as
a check on the accuracy of Mextra's nodal extraction. For
example, from Figure 3.2, the cifplot, there are eight
depletion pullup transistors and no enhancement pullups or
special transistors. The Powest analysis in Appendix A
confirms this count. This transistor count verification is
important in a MacPitts data path design analysis. It has
been observed that the Vdd bus (top metal trace, Figure 3.2)
does not always connect with the vertical lines to the
pullup transistors. The gap is so small that it is not
usually evident in Caesar, although a design rule checker
such as Lyra will detect it.

B. CONTROL PATH FIVE INPUT AND GATE

Chapter II discussed the two different types of
control path five input AND gates possible. The COND driven
AND gate was structurally more complicated (Figure 2.40),
while the "COND-less" AND gate was comparatively simple
(Figure 2.43). The COND driven AND gate is more likely to
occur in practice (since the purpose of the Weinberger array
is decision making, or conditional control), so that circuit
is analyzed in this section.

Figure 3.6 is the MacPitts driver which creates the
control path to implement this logic. Figure 3.7 is the
resulting Weinberger array, which has had the
odd_partial_gate input gap errors repaired in Caesar (so
Lyra and hence Mextra will work, and produce a valid .sim
file). Figure 2.41 is the logic gate equivalent of the
Weinberger array. Appendix A contains the object file for
this chip. The NOR character of the Weinberger array logic
was discussed in Chapter II, and in the LISP object file all
logic is done with NORs. Appendix A also contains the LISP
object file for the equivalent data path function, and in
Figure 3.2 all logic is implemented in AND organelles. The
Weinberger array is composed of inverters also, but an NMOS
technology inverter is just a degenerate (single input) NOR
gate. The difference in implementation from a software
(language) perspective is that the data path function is
done in organelles, and the control path function is done
exclusively in NORs. The data path organelles are already
compiled in the organelles.lisp files, so MacPitts has to
work harder to create the equivalent function in the control
path. Both the basename.obj file and the cifplot of the
Weinberger array show the NOR logic implicit in control path
combinational logic. The MacPitts script file is shown in
Appendix A, and its data path counterpart is also shown for
comparison. These files contain information which will be
compared in the next section.

Figure 3.5 Gate Equivalent of D.P. AND Showing Delays

Figure 3.6 Control Path Five Input AND Gate .mac File

The same CAD tools were used on this circuit as were
used on the data path circuit, in the same order. Mextra
produces the .nodes file (Appendix A). The control path
logic also differs from the data path logic in the number of
nodes produced to model the equivalent circuit. The
Weinberger array node list is approximately 25% larger than
the equivalent data path node list. Appendix A contains the
Crystal analysis of the circuit, and the critical path file
for source input to the Caesar file. Figure 3.8 depicts the
Weinberger array with the critical nodes marked, and Figure
3.9 is the gate level equivalent of Figure 3.8 with delay
node values and gate equivalent fan-ins marked. Appendix A
contains the Powest analysis of the control path AND gate,
and this information is incorporated into the following
table for comparison.

C. SPEED-POWER COMPARISON

Table 3.1 compares functionally equivalent MacPitts five
input AND gates in both their control and data path
configurations.

Figure 3.7 Weinberger Array From C.P. Five Input AND Gate

Figure 3.8 Weinberger Array With Critical Nodes Marked

Figure 3.9 Gate Equivalent of Weinberger Array With Delays

                        TABLE 3.1

                   FIVE INPUT AND GATE

                             CONTROL PATH    DATA PATH

MacPitts power [W]           .0407           .0381

Powest power
   average [W]               .00182          .00094
   maximum [W]               .00245          .00188

Maximum delay
   Crystal [ns]              81.15           85.98

Length x width of logic
   circuit [lambda]          209 x 173       386 x 113

Number pullups
   (less pads)               12              8

Compile time
   [CPU min]                 2.106           1.535

CPU peak memory demand
   [kb]                      349             357



So all other things being equal, the data path circuit
is superior to the control path circuit in terms of power
consumption, size, and compile time in MacPitts, and
slightly inferior in terms of maximum speed attainable.

The data path power advantage is understandable when the
number of depletion pullups there is compared to the number
in the control path. A power consumption ratio of 0.67 is
expected, and the calculated ratio is close to that. The
difference is explained by the long horizontal polysilicon
runs in the Weinberger array, which have a comparatively
high specific resistance (ohms/square), and therefore
consume more power. The first row in the table above,
MacPitts computed power, is calculated on the whole chip and
not on just the logic circuitry. This value shows a similar
power consumption relationship, but the poly runs connecting
the Weinberger array to the rest of the circuit consume
additional power (the rest of the analysis in the table
above is done on just the logic circuits, and not on the
whole chip).
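
The expected ratio of 0.67 follows from the pullup counts alone (eight in the data path circuit, twelve in the control path circuit) if power is taken as proportional to the number of depletion pullups, which is how Powest is described earlier. The Python sketch below reproduces that reasoning; powest_like is a hypothetical stand-in and not the Powest program, and the per-pullup power is an arbitrary placeholder that cancels out of the ratio.

```python
def powest_like(n_pullups: int, watts_per_pullup: float = 1.0) -> dict:
    """Pullup-count power model: always on (maximum), half on (average)."""
    maximum = n_pullups * watts_per_pullup
    return {"maximum": maximum, "average": maximum / 2}

def expected_power_ratio(data_path_pullups: int,
                         control_path_pullups: int) -> float:
    """Data path power divided by control path power under this model."""
    dp = powest_like(data_path_pullups)
    cp = powest_like(control_path_pullups)
    return dp["maximum"] / cp["maximum"]
```

With eight and twelve pullups the model gives a ratio of about 0.67, independent of the value chosen for watts_per_pullup.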

The speed of the two circuits is approximately the same. 
Figures 3.4 and 3.8 show the Crystal-generated delay data on 
the data path and control path circuits. The results are 
perhaps clearer in Figures 3.5 and 3.9, the logic gate 
equivalents of the cifplots. In the data path (Figure 3.4), 
the signal experiences approximately 21 ns delay per 
organelle. The organelle comprises a NAND gate and an 
inverter (Figure 2.14). From the gate equivalent and the 
Crystal script (Figure 3.7), each NAND gate induces a delay 
of 9.4 ns, and each inverter induces a delay of 11.4 ns. The 
circuit shown in the gate equivalent is expected to produce 
a delay equal to the product of the number of organelles and 
the delay per organelle. The expected delay is then 4 x 20.8 
= 83.2 ns. The cifplot (Figure 3.2) reveals where the added 
three ns delay arises. The river routing routine in MacPitts 
runs the input and output lines in polysilicon, and in this 
case the output comes from across the circuit. The specific 
resistance and capacitance of polysilicon and the poly input 
and output line lengths account for this added delay. Froede 
[Ref. 3:pp. 72-76] has validated Crystal's timing 
calculations and compared them for accuracy with the theory 
presented in Mead and Conway [Ref. 4:pp. 3-14]. 
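
The cascade arithmetic above can be checked with a short Python sketch. The per-gate delays and the organelle count come from the text; the 86 ns total is the rounded data path figure quoted later in this chapter and is used here as an assumption. The names are illustrative only.

```python
NAND_DELAY_NS = 9.4        # per NAND gate, from the Crystal data in the text
INVERTER_DELAY_NS = 11.4   # per inverter, from the Crystal data in the text
ORGANELLES = 4             # cascaded two-input AND organelles for five inputs
MEASURED_TOTAL_NS = 86.0   # assumed: rounded data path delay quoted in the text

def cascade_delay(stages, nand_ns, inverter_ns):
    """Delay of a linear cascade of NAND-plus-inverter organelles."""
    return stages * (nand_ns + inverter_ns)

expected = cascade_delay(ORGANELLES, NAND_DELAY_NS, INVERTER_DELAY_NS)
routing_overhead = MEASURED_TOTAL_NS - expected
print(f"expected {expected:.1f} ns, routing overhead {routing_overhead:.1f} ns")
```

The residual of roughly three ns is the part the text attributes to the polysilicon input and output runs.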

Figure 3.8 is the corresponding control path cifplot with 
Crystal delay annotation for the Weinberger array. The 
structure of the Weinberger array is, at first glance, 
intimidating. Two observations on function assist in 
understanding the structure: (1) any GND track that connects 
a Vdd track with only one diffusion gate is an inverter, and 
(2) any GND track that connects a Vdd track with multiple 
diffusion gates is a multiple input NOR gate. The transverse 
poly runs turn on and turn off the NOR gates and inverters. 
This cifplot shows six inverters and six NOR gates. 
Furthermore, multiple input/single output Weinberger arrays 
appear to always exhibit the four level structure shown in 
Figure 3.9: a bank of inverters followed by a bank of 
multiple input NORs followed by a single multiple input NOR 
followed by an output inverter. Figure 3.9 is the gate level 
equivalent of the Weinberger array in Figure 3.8, with delay 
annotation and fan-in (shown inside the bodies). The 
critical path is from input A to the second level nine-input 
NOR through the output NOR through the output inverter. The 
Weinberger array total delay is then 81.15 ns, not much 
different from the data path circuit delay. This delay 
calculation only considered the Weinberger array, however, 
and not the connections to it which MacPitts creates in 
polysilicon. If these additional connections were 
considered, the Weinberger array would certainly be slower 
than the equivalent structure in the data path. Figure 3.8 
shows the critical path (annotated with cumulative delay 
times), and it is evident that the longest delay path occurs 
along the wires which must charge the largest capacitances. 
The data path block is connected to the rest of the chip 
with metal lines (in most cases), so this added delay from 
polysilicon runs would not apply to it. 
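
The two reading rules for a Weinberger array cifplot can be stated as a small classifier. This is a sketch only: the function name is hypothetical, and the example track list is invented for illustration rather than read from the actual cifplot.

```python
def classify_track(diffusion_gates):
    """Classify a GND track by how many diffusion gates tie it to a Vdd track."""
    if diffusion_gates < 1:
        raise ValueError("a track needs at least one diffusion gate")
    if diffusion_gates == 1:
        return "inverter"
    return f"{diffusion_gates}-input NOR"

# Hypothetical gate counts for a few tracks; not taken from Figure 3.8.
tracks = [1, 1, 5, 9, 2, 1]
labels = [classify_track(n) for n in tracks]
print(labels)
```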

The relative sizes of the data path and control path 
circuitry are as expected from the respective object code 
descriptions. The object code for the data path 
instantiation is approximately half the size of the code for 
the control path. From a theoretical viewpoint, the cascaded 
AND organelle circuit is more conservative of both silicon 
and power than is the Weinberger array. This principle 
applies to most combinational logic in MacPitts, since the 
Weinberger array builds functions only from NOR gates, 
whereas in the data path the choice of building blocks is 
larger (NAND, NOR, and inverter). The MacPitts chip size 
comparison is given in the table above, but the circuit 
dimensions are more informative. The data path circuitry has 
an area of .090 square mm, and the Weinberger array covers 
.109 square mm, about 120% of the area of the data path 
functional equivalent. 

The compile time for the control path chip is 
approximately 25% greater than for the data path chip. This 
is understandable in light of the gate instantiation process 
for each path. From the cifplots in Figure 3.2 (data path) 
and Figure 3.7 (control path), the circuits are not even 
remotely similar structurally. The data path circuit is made 
from quadruple instantiation of the MacPitts library AND 
organelle (see Appendix A, the object code). This organelle 
is accessed four times, its location calculated, and then 
it is instantiated. The control path Weinberger array 
(Figure 3.7) requires time consuming decisions and 
construction from more primitive units, NOR gate inputs (see 
the object code, Appendix A). The poly cross-runs must then 
be laid down. All of these processes are computationally 
intensive, and this is why large control-heavy Weinberger 
array architectures take a long time to compile. Chapter VI 
describes the design of a control path chip and how long is 
required for compilation. 

D. ALTERNATE POSSIBILITIES FOR FIVE INPUT AND GATES 

The five input AND gate, as implemented by MacPitts in 
both its data path and control path configurations, has been 
examined above. Each configuration can be improved in the 
areas of speed and circuit density. While the goal of 
silicon compilation is to free the designer from excessive 
preoccupation with detail, perhaps the combinational logic 
generation by MacPitts can be improved. The following 
section presents two hand-designed variants of the five 
input AND gate for comparison with the MacPitts designs. 

The first design is patterned after the Mead-Conway 
cells as illustrated throughout [Ref. 4]. The layout is 
similar to that generated by MacPitts for the five input 
data path AND gate, a linear cascade of NANDs and inverters. 
Figure 3.10 shows the hand-crafted circuit. It is noticeably 
different from the MacPitts design in two ways. The pulldown 
transistors on the NAND gates are four lambda wide. This 
allows a shorter data path, while preserving the 4:1 aspect 
ratios of the transistors. Also, the characteristic MacPitts 
pullup diffusion "dogleg" is absent. This is accomplished by 
joining the pullup diffusion and polysilicon layers with an 
in-line buried contact. The circuit is also less wide than 
the MacPitts equivalent. MacPitts uses NAND organelles, and 
interconnects them with metal/poly/diffusion wires. This 
wastes a lot of space. In the hand-designed five input AND 
gate, the output is taken from the pullup on a polysilicon 
wire, and routed directly to the input of the next 
transistor. This saves (at a minimum) two contact cuts in 
the transistor interconnections. As expected, this 
configuration is also considerably faster than the MacPitts 
equivalent. The MacPitts data path five input AND gate 
requires 86 ns for signal propagation, and the handcrafted 
design requires 22 ns. Figure 3.11 shows the gate equivalent 
of the hand design, with propagation times marked above the 
respective gates. 

This configuration is amenable to silicon compilation if 
the NAND-NOT pairs as shown are incorporated into the 
MacPitts organelle library as an AND organelle. Similar 
speed and area enhancements are expected for other data path 
logic gates. 

If the multiple input AND gate can be improved so much 
using the basic MacPitts data path cascading scheme, does a 
better method exist using another approach? The drawback to 
the cascading scheme is the linear pileup of transistors. 
This requires more silicon, and consequently more current to 
charge the gates of later stages. A better design would use 
only one gate for the five input AND function, as shown in 
Figure 3.12. This is a true five input AND gate, as opposed 
to the previous circuits which only emulate the five input 
AND function. The circuit is much smaller than the previous 
five input AND gates, and is much faster. Figure 3.13 is the 
gate equivalent with marked delays. This circuit is 
patterned after those circuits illustrated in [Ref. 4] also. 
The wide (10 lambda) pulldown region permits a comparatively 
short transistor (i.e., the pullup aspect ratio is not very 
large). The multiple input NAND and NOR derivatives 
patterned after this gate should be simple to incorporate 
into the silicon compiler. The only decisions required are 
how many inputs (set by the designer), spacing of the input 
wires (set by the design rules), and pulldown diffusion 
column width (must be calculated as a function of number of 
input wires to the gate). If a silicon compiler is desired 
which produces fast, compact combinational logic circuitry, 
this method should be considered. 

Table 3.2 compares the data path AND gate (DP), the control 
path AND gate (CP), the hand-crafted linear cascaded AND 
gate (LC), and the multiple-input AND gate (MI). 



Figure 3.10 Mead-Conway Style Five Input Linear AND 



Figure 3.11 Gate Equivalent of Figure 3.10 With Delays 



Figure 3.12 Compact Five Input AND Gate 



Figure 3.13 Gate Equivalent of Optimal Geometry Five 
Input AND Gate Showing Delays 



TABLE 3.2 

COMPARISON OF 

DP    CP    LC 

1    2.45    1.88    2.0    .5    81.15    85.98 



IV. SEQUENTIAL LOGIC IN MACPITTS 



Based on previous analysis, combinational logic in 
MacPitts is done better (i.e., more efficiently, when a 
choice exists) in the data path than in the control path. 
Does the possibility of improving MacPitts' sequential logic 
performance exist also? A study of this question presents 
interesting problems. 

A. AN OVERVIEW 

Chapter II discussed two different ways of increasing 
throughput, the PAR form and the COND form. There exists 
also a method of global parallelism available to the 
MacPitts programmer, the PROCESS form. The PROCESS form has 
the syntax 

(process <process name> <stack depth> ... ) 

where the process name is an arbitrary ASCII character 
string (if the name is made short, then the VT-100/ADM-3A 
interpreter screen can display them all). The stack depth 
refers to the depth of subroutine calls for which this 
process must push return addresses onto its program 
counter LIFO stack. MacPitts syntax requires the designer 
to determine this stack depth a priori, and to explicitly 
state it to MacPitts (the silicon compiler). The stack 
depth is a required field in the PROCESS statement, and 
may be any integer including zero. Each process has its 
own stack, and all processes are executed in parallel. 
This parallelism provides a high throughput on a properly 
designed algorithm. 

An extension of the digital home temperature controller 
of Chapter II might also control other aspects of the home 
environment. For instance, it would be desirable to turn the 
security lights on and off by a photoelectric cell signal, 
to start the coffee brewing and the microwave oven cooking 
dinner at a timer signal, and to keep the lawn appropriately 
watered by turning the sprinkler on upon a moisture detector 
signal. The following MacPitts program outline would 
accomplish these tasks. All logic is done on Boolean 
variables, flags for storage and signals for sensor inputs. 



(program house <word size> 

    <port, signal, register, and flag assignments> 

    (process lite 0 
        (setq lights (not photo_cell_input))) 

    (process food 0 
        (cond 
            (six_am     (setq mrcoffee t)) 
            (seven_am   (setq mrcoffee f)) 
            (four45_pm  (setq put_dinner_in t)) 
            (five_pm    (setq microwave_on t)) 
            (five30_pm  (setq microwave_on f)))) 

    (process environ 0 
        (cond 
            (hot   (setq fan_on t) 
                   (setq window_open t) 
                   (setq doors_open t)) 
            (cold  (setq heater_on t) 
                   (setq window_open f) 
                   (setq doors_open f)) 
            (t     (setq heater_on f) 
                   (setq fan_on f) 
                   (setq window_open t) 
                   (setq doors_open t)))) 

    (process grass 0 
        (setq sprinkler_on (not lawn_moist))) 

    (process clock 1 
        (par (call mod60) (setq time counter_out))) 

    mod60 
        <a modulo sixty up counter algorithm> (return)) 



All of these processes are done in parallel. All of the 
processes have a stack depth of zero except for the clock 
process, which has a stack depth of one. This is necessary 
due to the clock process calling a subroutine, the modulo 
sixty up counter. The call of the counter and the following 
SETQ are paralleled with the PAR construct. This PAR 
paralleling appears to work well for cases where the output 
depends on the called routine, like the example above. If 
the dependency is reversed (for instance, paralleling SETQs 
of inputs to a slow multiplier subroutine with the CALL to 
that multiplier) some unpredictable results can arise. A 
good practice is to emulate all time-dependent algorithms 
alone in the interpreter prior to their incorporation into 
the MacPitts algorithm. In so doing, syntax errors may be 
found and fixed and the algorithm may be optimized for 
number of cycles required to execute. 

For fast architectures, some additional speed can be 
gained by paralleling the subroutine outputs with the 
RETURN from the subroutine. For instance, the mod60 
counter-timer in the previous example is called as a 
subroutine. 

mod60 

    (par (setq counter_out count) (return)) 

There exists no time-dependency between the final result 
(counter_out) and the RETURN to the main program, so no 
data latency results from this paralleling. 

To re-emphasize, all of the PROCESSes under the PROGRAM 
statement execute in parallel. So while the <house> chip is 
monitoring temperature and time, it is simultaneously 
monitoring lawn moisture, setting the house clock, and 
checking the outside light level. PROCESSes execute 
independently, in parallel. Each PROCESS has its own 
independent stack, and processes do not communicate 
internally with each other. From the hardware standpoint, 
each process is an independent MacPitts entity sharing data 
storage elements and signal wires. 
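
As a software analogy only, the parallel execution of PROCESSes can be pictured as several independent functions that all read the same snapshot of the shared signals on each clock cycle. The Python sketch below models three of the <house> processes this way. It is an assumption-laden illustration of the behavior described above, not a description of how MacPitts generates hardware, and the helper name clock_cycle is hypothetical.

```python
# Each PROCESS is modelled as a function that reads the shared Boolean
# signals and returns the flags it sets on this clock cycle.
def lite(signals):
    return {"lights": not signals["photo_cell_input"]}

def grass(signals):
    return {"sprinkler_on": not signals["lawn_moist"]}

def environ(signals):
    if signals["hot"]:
        return {"fan_on": True, "window_open": True, "doors_open": True}
    if signals["cold"]:
        return {"heater_on": True, "window_open": False, "doors_open": False}
    return {"heater_on": False, "fan_on": False,
            "window_open": True, "doors_open": True}

PROCESSES = (lite, grass, environ)

def clock_cycle(signals, flags):
    """Run every process against the same input snapshot, then commit."""
    updates = [process(signals) for process in PROCESSES]
    new_flags = dict(flags)
    for update in updates:
        new_flags.update(update)
    return new_flags

flags = clock_cycle(
    {"photo_cell_input": False, "lawn_moist": True, "hot": True, "cold": False},
    {},
)
print(flags)
```

No process reads another process's result within a cycle, which mirrors the statement that processes do not communicate internally with each other.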



In this somewhat artificial example, there is no strict 
requirement for speed. If the lawn is watered 50 
microseconds late, the grass will still grow. But the 
principle of global process parallelism applies to more 
complicated digital systems where intricate timing 
interrelationships exist. It is also evident that MacPitts 
is a very versatile silicon compiler. A chip constructed 
from a similar multi-process algorithm could be used to 
control many off-chip processes simultaneously. The 
intrinsic nature of the PROCESS form lends itself well to 
applications such as industrial digital control. In 
situations where the PROCESS statement is used to force 
parallelism but the parallelism is not needed (for instance, 
the <house> algorithm), MacPitts creates a large layout. 
Silicon area is traded off for speed. 

This algorithmic outline illustrates using PROCESSes in 
a combinational logic machine. PROCESSes are required around 
any invocation of a subroutine, but aside from this 
consideration, the <house> chip could be specified just as 
well without PROCESSes. 

PROCESSes are required, however, to describe a 
sequential logic machine in MacPitts. The FSM architecture 
is explicitly specified by the PROCESS form. The PROCESS 
statement implicitly specifies creation of sequencers (a data 
path hardware organelle, which steps the FSM through its 
states) and their instantiation in the data path. 

B. GRAY CODE TO BINARY DECODER 



The following section illustrates the MacPitts design of 
a simple sequential logic system. The Gray code [Ref. 5: 
p. 97] finds many diverse uses in electrical engineering and 
computer science. Whenever a single bit change in successive 
data words is desired (disk sector addressing, radar 
antenna positioning), the Gray code should be considered. In 
finite automata theory, the Gray code decoder can be 
regarded as a sequence detector. The desired sequential 
machine complements the input on having received an odd 
number of earlier 1's, and does not complement the input on 
an even number of 1's. An example sequence is 

input:  1 1 1 1 0 0 0 0 1 0 1 0 1 1 0 0 1 ... 

output: 1 0 1 0 0 0 0 0 1 1 0 0 1 0 0 0 1 ... 
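
The decoding rule stated above (complement a bit when an odd number of 1's has already arrived) can be sketched in a few lines of Python. This is a behavioral illustration only, assuming the stream arrives most significant bit first; the function name is not part of MacPitts.

```python
def gray_to_binary(bits):
    """Serial decode: complement a bit after an odd number of earlier 1's."""
    odd_ones_seen = False
    out = []
    for bit in bits:
        out.append(bit ^ 1 if odd_ones_seen else bit)
        if bit == 1:
            odd_ones_seen = not odd_ones_seen
    return out

gray = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 0, 1, 1, 0, 0, 1]
binary = gray_to_binary(gray)
print(binary)
```

Run on the example input sequence, the sketch yields 1 0 1 0 0 0 0 0 1 1 0 0 1 0 0 0 1, with the first output bit passed through uncomplemented because no 1's precede it.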

The Gray code decoder can be implemented in MacPitts as 
a Mealy FSM to detect this sequence, and set the appropriate 
outputs. The automaton for the Gray code decoder is shown in 
Figure 4.1. The node label MSBS indicates most significant 
bits, COMPL means complement the present bit, and NEXTBIT 
means consider the next bit. 

1. Algorithm Design 

The next consideration is algorithm design. Previous 
experience inclines the designer toward a data path 
architecture (faster, smaller, less power consumption). 
Furthermore, a data path chip would probably have a greater 
throughput, since the operations could be done on words, and 
not individual bits (e.g., a parallel Gray code decoder, 
which decodes on a word basis rather than a bit-by-bit 
basis). 



Figure 4.1 Gray Code Decoder State Transition Diagram 

The problem with this approach is that MacPitts 
permits no explicit, succinct method of setting the 
individual bits in a word. The bits can be tested with the 
BIT expression, but not set. So a control path (implying 
Boolean type data and Weinberger array combinational logic) 
architecture is probably a better choice. 

A control path FSM can be designed with MacPitts 
(even though no explicit data path is used). The reason is 
the way in which MacPitts implements FSM state 
transitioning with the sequencer organelles. The sequencer 
can be thought of as a bank of n sequencer organelles, where 
n is the data path width specified in the PROGRAM statement. 
The sequencer organelles are physically adjoined to the data 
path organelles in the MacPitts chip. The sequencer stores 
FSM state, much in the same way as flip-flops store state in 
a discrete-chip FSM design. And just as two raised to the 
power (number of flip-flops) limits the states in a discrete 
digital system, so two raised to (number of sequencers) 
limits the states possible in a MacPitts sequential machine. 
The number of sequencers is always equal to n, the data path 
width. This has ramifications for MacPitts designers 
considering a system of many states with a narrow data path. 
The possible number of states is limited to 2**n. 

One solution to the Gray code problem is to use a 
data path architecture, to declare the data path width as 
two, and to specify an extra (unused) bit in the input and 
output port declaration statements. The most significant 
bit of the input port is obviously extraneous, but the data 
path width of two is necessary to address the three states 
required (Figure 4.1). When the Gray code chip is used, 
these extra pins must be tied to ground. If a data path 
width of one is specified (and PORTS are used for inputs) 
instead, MacPitts gives the following diagnostic 

Error -- Word length too small to store the state for 
this process 

If the data path width is left as two, but the input and 
output ports are left only one bit wide (another attempt 
to circumvent this problem), MacPitts responds with 

Error-Invalid port definition 

which means that the data path width was declared as two, 
but the port is only one bit wide (MacPitts has helpful 
diagnostics). The MacPitts source code file, extract.lisp 
(under the def get-sequencer-from-process macro) shows why 
this constraint exists. The sequencer width is explicitly 
set to the data path width. 
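
The state limit can be turned into a quick sizing check before compilation. The Python sketch below assumes only the relationship stated in the text, that n sequencers can encode at most 2**n states; the function names are hypothetical, and the sketch does not model any other width constraint MacPitts may impose.

```python
def max_states(data_path_width):
    """One sequencer per data path bit, so at most 2**n encodable states."""
    return 2 ** data_path_width

def min_data_path_width(num_states):
    """Smallest word size whose sequencer bank can encode num_states states."""
    if num_states < 1:
        raise ValueError("an FSM needs at least one state")
    return max(1, (num_states - 1).bit_length())

print(min_data_path_width(3), max_states(2))
```

For the three-state Gray code decoder this gives a minimum width of two, which agrees with the word length diagnostic reported for a width of one.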

Figure 4.2 shows the MacPitts driver code to do the 
Gray code to binary conversion serially. The MacPitts 
algorithm shown in Figure 4.2 has the lines numbered for 
reference, but the numbers are not part of the allowed 
MacPitts syntax. 



 1  ;Gray Code to binary conversion algorithm 
    ;This code illustrates the Data Path (i.e., 
    ;Integer) solution to the problem. It is but one 
    ;Variant of many possible solutions. 

    ;Define the data path width as 2 (state transitioning) 
 2  (program gc 2 
 3    (def 1 ground) 
 4    (def 2 phia) 
 5    (def 3 phib) 
 6    (def 4 phic) 
    ;All FSMs must have a RESET input (for initialization) 
 7    (def reset signal input 5) 
    ;Use INTEGER (port) input & output, 2 bits wide 
 8    (def inp port input (6 7)) 
 9    (def bin port output (8 9)) 
10    (def 10 power) 
    ;Specify FSM architecture 
11    (process grycod 0 
12     msbs    ;(Most Significant Bits) 
13      (cond ((=0 inp) (setq bin 0) (go msbs)) 
14            ((= 1 inp) (setq bin 1) (go compl))) 
15     compl   ;(COMPLement bits) 
16      (cond ((=0 inp) (setq bin 1) (go compl)) 
17            ((= 1 inp) (setq bin 0) (go nextbit))) 
18     nextbit ;(NEXTBIT in string) 
19      (cond ((=0 inp) (setq bin 0) (go nextbit)) 
20            ((= 1 inp) (setq bin 1) (go compl))) ) ) 



Figure 4.2 Gc.mac 



Line 1 is the title, using a semicolon as the reserved 
word comment designator. Line 2 is the PROGRAM statement; 
the program name is gc (Gray code) and the data path width 
is two. Lines 3, 4, 5, 6, and 10 are standard, and required 
by MacPitts conventions. Line 7 is required for all FSMs, 
and when it is raised high (positive logic arbitrarily 
chosen here), the FSM/PROCESS is reset to its initial state. 
Line 8 defines the input port, inp, and declares it integer 
two bits wide. Line 9 does the same for the output port, bin 
(binary value). Line 11 specifies FSM architecture with the 
PROCESS statement, for which the stack depth is zero (no 
calls to subroutines). Line 12 is a node label, msbs (most 
significant bits), and represents the top node in Figure 
4.1. Line 13 is the first check in this state, and says that 
if the input equals zero, then set the output to zero and go 
to node msbs. If the input does not equal zero, then go to 
the next line of code. Line 14 checks whether the input 
equals one. If the input is equal to one, the output is set 
to one, and the program transitions to the complement 
(compl) state. Line 15 implements the second node in Figure 
4.1, complementing the input. Line 16 checks the input, and 
if it equals zero it complements and keeps complementing as 
long as the input equals zero, and if not, it proceeds to 
the next line. Line 17 checks for the sequence of an even 
number of ones, and if true, sequences to the next node 
after complementing the input. Line 18 is the label 
corresponding to the last node in Figure 4.1, nextbit. Line 
19 checks the input bits, sets the output to the input 
value, and returns to this node as long as the input is 
zero. Line 20 also sets the output to the input value, but 
jumps back to the bit complement node when the input is one. 
The conditional in line 17 is unnecessary, but is included 
for clarity (if the non-storage port, bin, is not explicitly 
set to one, it will become zero at the next state 
transition. Line 17 can be eliminated, and the algorithm 
will work correctly anyway). 
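
Before running the MacPitts interpreter, the state transitions of the grycod process can be cross-checked in software. The Python sketch below transcribes the three states of the listing into a table and compares the result with the textbook Gray-to-binary rule. It is an independent behavioral model written for illustration, not output from MacPitts, and it treats inp and bin as single bits (the unused upper bit is assumed to be tied to ground).

```python
# Transition table transcribed from the numbered listing (lines 12-20):
# state -> {input bit: (output bit, next state)}
GRYCOD = {
    "msbs":    {0: (0, "msbs"),    1: (1, "compl")},
    "compl":   {0: (1, "compl"),   1: (0, "nextbit")},
    "nextbit": {0: (0, "nextbit"), 1: (1, "compl")},
}

def run_fsm(bits, table=GRYCOD, start="msbs"):
    """Mealy machine: each output depends on present state and input."""
    state, outputs = start, []
    for bit in bits:
        out, state = table[state][bit]
        outputs.append(out)
    return outputs

def xor_reference(bits):
    """Textbook rule: b[i] = b[i-1] XOR g[i], most significant bit first."""
    previous, outputs = 0, []
    for bit in bits:
        previous ^= bit
        outputs.append(previous)
    return outputs

stream = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 0, 1, 1, 0, 0, 1]
assert run_fsm(stream) == xor_reference(stream)
print(run_fsm(stream))
```

In this model the msbs and nextbit states behave identically once the first 1 has arrived, so the table could be reduced to two states; the three-state form is kept to match the listing.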

The next step is to test and debug the algorithm in 
the interpreter prior to full compilation. The Gray code 
algorithm was debugged in the interpreter, and compiled with 
the <herald> option. Appendix B shows the script recording 
of the compilation process, and indicates a data path of 
seven different organelles (to be discussed in the next 
section) and a moderate-sized (31 columns) Weinberger array. 

Figure 4.3 shows the chip resulting from the 
compilation of gc.mac. The functional constituents of this 
layout will be treated qualitatively in the next section. 

2. Functional Constituents of the Chip 

The layout scheme of MacPitts places general 
functional blocks in specific relative locations on the 
chip. Figure 4.4 indicates where these relative locations 
lie on the cifplot. The block sizes shown in Figure 4.4 are 
arbitrary, since the actual sizes depend on a combination 
of algorithm and MacPitts (the source code). In comparing 
Figure 4.4 to Figure 4.3, it is seen that this chip has no 
flags, which is expected since none are defined in the 
source algorithm. The rest of the blocks shown in Figure 4.4 
are instantiated in gc.cif (Figure 4.3). 

Figure 4.3 Gc.cif 



Figure 4.4 MacPitts Layout Scheme 



The data path arithmetic block is shown in Figure 
4.5. The function of this unit is to operate on the inputs 
so that the desired outputs result. The inputs enter the 
arithmetic block and the outputs exit as shown in Figure 
4.5. Between input and output, the data is subject to 
switching and various logic operations. The data path and 
the control path must also communicate with each other over 
the interconnecting traces. The leftmost top poly line, D9, 
is an input to the Weinberger array, where it turns on five 
NOR gates. Similarly, the other nine lines also connect to 
the control path. Lines D8, D7, D5, D4, D3 (reset), D2, D1, 
and D0 are outputs from the control path and inputs to the 
data path. Line D6 is the other output from the data path to 
the control path. The inputs to the data path can be 
understood as relay controls, or switches. The outputs from 
the data path to the Weinberger array are Boolean values to 
cause decisions about what to do next. 

From Figure 4.5, the arithmetic path of this chip is 
seen to be two bits wide (the two horizontal parallel 
organelle chains). In Chapter II it was shown that syntax 
implicitly controls instantiation. Line 13 in the Gray code 
algorithm specifies two data path operations 

(cond((=0 inp)(setq bin 0) 

where the (=0 inp) is a logical comparison integer test, and 
(setq bin 0) is an integer form by definition of bin in the 
def statement and the source for bin being an integer, zero. 
The leftmost set of cascaded OR gates makes the (=0 inp) 
test, and signals the control path on line D9. Figure 4.6 
shows the logic diagram for this stipple plot, and the 
results for a zero input. 

Proceeding right on the arithmetic block stipple 
plot, the next block is a set of paralleled NOR gates. The 
inputs are the inp bits, inp0 and inp1, and Vdd and GND. The 
output is a signal to the control path from D8 which 
determines the chip output, bin (BINary equivalent of the 
Gray code bit stream). This circuit does not directly make 
the output assignment, (setq bin 0), but rather does it 
through combinational logic in the Weinberger array. Figure 
4.7 is the logic diagram of the setq operation circuitry. 
The circuit is annotated to show a zero bit input on inp, in 
which case a TRUE is sent to the control path on line D8. 

Proceeding right in the data path, the next two 
blocks in Figure 4.5 show pass transistor units. The 
leftmost pass transistor unit has inputs from bin0, bin1, 
and control on D7. The output is a signal to control on D6. 
This section of the data path is where the output bin is 
set, although the logic for setting bin is determined in the 
preceding two data path units and the control path. To the 
right of this unit is another pass transistor block which 
takes inputs from the previous pass transistor unit, from 
the clock drivers, from control on lines D5, D4, and D3, and 
from the sequencer. The function of this unit is state 
transition. The sequencer inputs represent the current 
state, and this unit drives the state registers which signal 
next state to the sequencer tail, at far right. The input D3 
is the reset signal, which implements the MacPitts function 
of returning the FSM to its initial state when raised high. 



Figure 4.6 Test Logic for (=0 inp) 



Figure 4.7 SETQ Operation (Signal to Control) 

Figure 4.8 shows the state registers, a set of 
parallel 2-T memory cells, in which the current state is 
held. The inputs to the state registers are the outputs of 
the previous pass transistor block, signalling next state 
transition, and the three clock lines from the clock driver. 
The outputs are the two state bits (S0 and S1) to the 
control path (on lines marked C1 and C0, Figure 4.10). The 
Mealy FSM methodology is evident in MacPitts from both the 
algorithmic and hardware viewpoints. The output is a 
function of both input (inp0, inp1) and present state (S0, 
S1). 



Below the state registers in Figure 4.3 are the 
clock drivers. Figure 4.9 is a blowup of the driver 
organelles, used for buffering the clock signals and 
generating the five overlapping clock signals. The drivers 
are turned on by a signal from the Weinberger array, C5. 
Carlson describes the clocking scheme and the reasons behind 
its choice [Ref. 2:p. 26]. 

Figure 4.8 MacPitts State Registers 



Figure 4.9 Clock Drivers and Five Segment Generator 



The rightmost block in the data path is the 
sequencer. Figure 4.10 is the cifplot of the sequencer 
combinational logic, and Figure 4.11 is its gate equivalent. 
The sequencer has as its inputs the current state (S0, S1) 
and produces as its outputs the next state (S0+1, S1+1). 
The gate diagram of the sequencer answers the question asked 
in the initial design of the Gray code decoder, why three 
states are not allowed in a control path (i.e., a data path 
width of zero) architecture. The answer lies in the implied 
data path structure, as explained previously and as 
graphically shown in Figure 4.10 and Figure 4.11. The data 
path width as specified in the PROGRAM statement sets the 
number of sequencers to be instantiated, and the number of 
sequencers limits the number of states possible. If fewer 
FSM states are required than the sequencer depth can 
transition to, the sequencers are nevertheless instantiated, 
but their outputs are not connected to the control path (C0 
and C1 in this example). For example, this would occur for a 
wide data path which had few states. If a data path FSM chip 
were designed with a word length of five, and only four 
FSM states were needed, MacPitts would instantiate all five 
of the sequencer organelles. Only the top two would be 
connected to the Weinberger array. Figure 4.12 is a block 
diagram of the MacPitts sequencer organelles, and shows how 
the Mealy FSM is implemented. The multiplexers on each side 
of the state registers determine that the next state is a 
function of both present state and present input. The 
Weinberger array controls the gating in the multiplexers to 
allow the appropriate signals to pass to the state 
registers. 



Figure 4.10 Sequencer Tail 



Figure 4.11 Sequencer Tail Logic Diagram 



Figure 4.12 Mealy FSM in MacPitts 



The Weinberger array is the control path in a 
MacPitts chip, for reasons explained in Chapter II. The 
Weinberger array is shown in Figure 4.13, and its labelled 
gate equivalent is Figure 4.14. In the cifplot, all input 
and output columns have been labelled (A-Z) for comparison. 
The output lines have also been labelled (Cn) for reference 
to the other functional blocks of the chip. There are major 
differences between this multiple function Weinberger array 
and the single function Weinberger arrays considered 
previously. 

This Weinberger array for single output functions 
always has a four level structure, i nver ter ~WuR~NOR"- 
inverter- This is not the case for multiple output 

Weinberger arrays. This circuit has 11 inverters and 15 NOh 
gates. The maximum fan in on any NOR gate is six. In the 
previous Weinberger arrays, the maximum delay was 

appr ox i mat el y four gate delays- In this Weinberger array, 
the longest path is shown in Figure 4.14 as J~U~T~L~F“0--D , 
or Q~W~T~L““F-“G~*D , Each path induces appr ox i mat el y seven gate 
delays. The MacPitts script session (included in Appendix 
lists the control depth (NOR gate nesting) incorrectiy as 
four. Furthermore, the polysilicon runs cover proper t.i onal 1 v 
more area in this Weinberger array than in the previous 
single function ones. From Chaipter III, the polysilicon to 
substrate capacitance is a strong factor in limiting chip 
speed. The multiple function Weinberger arrays are expected 




Figure 4.13 Weinberger Array from Gc.cif

Figure 4.14 Gate Equivalent of Gray Code Weinberger Array
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to be slow. 



This Weinberger array has nine outputs (C11, C10, C8, C7,
C6, C5, C4, C3, and C2) and five inputs (C13, C12, C9, C1,
and C0). C13 is a check on the input signal values, and
comes from D9 (D indicates a signal to or from the data
path, C indicates a signal to or from the control path).
C11 is an output to D8 in the data path, the function of
which is not clear (data path output connecting control
path output). C10 is an output to D7, and the signal
controls pass transistor gating in the left pass transistor
unit, which determines the value of the output (bin0, bin1).
C9 is an input to the Weinberger array, and comes from D6.
This input is not set within the data path, and it is likely
that it results from MacPitts' expectations of a more
complicated structure. The sequencer organelles exhibit this
vestigial structure property also, as previously mentioned.
C8, C7, and C6 are outputs which control the second pass
transistor block (state register multiplexer) in the data
path. They connect to D5, D4, and D3, respectively, and
control the sequencer's next-state transitioning. C6 is
connected to pin five by a polysilicon run and C13, so C6
(D3) is the reset signal. C5 is an output which turns on the
clock drivers. C4, C3, and C2 are outputs connecting the
data path at D2, D1, and D0, where they control pass
transistor gating for the sequencers and state register. C1
and C0 are inputs from the state register which represent
the current state. Figure 4.15 shows the data path-control
path interconnections. The interconnections are summarized
in the diagram below.
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In this diagram, rst means reset, PT is a pass 
transistor unit, ves is a vestigial (non-functional) unit,
seq is sequencer, and reg is the state register.

Figure 4.15 Data Path/Control Path Interconnection

3. Alternate Designs

The gc.mac algorithm used explicit value assignment
in the output setq forms.

(setq bin <value>)

In this case, it is possible to explicitly set the output
to a value (one or zero). This is not possible, however,
for all algorithms, and is not even desirable in the
general case. Usually the output is a function of the
input(s), and not a specific value which is known
beforehand. With this in mind, an alternate algorithm was
written to implement the Gray code to binary conversion.
Figure 4.16 shows the algorithm, gc2.mac. This code
follows the state diagram given for gc.mac (Figure 4.2),
and the states all have the same names. The algorithms are
equivalent functionally and semantically (they both do and
say the same thing). The only difference is in the binary
output (bin) setq forms. In the previous algorithm,
gc.mac, the output bin is explicitly set to either one or
zero. In this algorithm, gc2.mac, the output is set to a
data path function of the input. The code in gc2.mac
represents the more general case.

The chip created by gc2.mac is expected to be larger
than the one created by gc.mac, since additional data path
decisions are required in the setq forms. The script file of
the gc2 MacPitts session (Appendix B) verifies this, and
Figure 4.17 shows the resulting layout. In comparing the two
script files, it is seen that gc2 requires more data path
units, data path transistors, and control path transistors.
This is reflected in the comparative complexities of the
data paths in Figure 4.3 and Figure 4.17. The chip produced
by gc2 would also consume slightly more power and be
slightly larger than the chip produced by gc. The conclusion
is that by explicitly specifying the setq destination
values, the designer can save area and power consumption. A
reasonable expectation would also be a faster chip.

Explicit assignment of outputs is therefore
desirable, though not always feasible. In many control path
architectures, where the output is treated as individual
bits, explicit assignment is possible (though not always the
optimal solution; see Chapter VI on Hamming error
correctors, where there are many outputs possible). In data
path or hybrid architectures where there are only a few
numerical outputs possible, explicit assignment of output
values should also be considered (see the blackjack
algorithm, following). A general rule is to choose the
method that results in the shorter algorithm, whether (1)
explicit assignment of outputs, or (2) assignment of outputs
as a function of either inputs or intermediate values. The
significance of this is that the designer can influence the
design by the MacPitts program written, even though the
silicon compilation process is automatic.

The two previous algorithms assume serial decoding.
If it is desired to do the decoding faster, parallel
decoding should be considered. MacPitts has a mechanism for
this implicit in the integer data types (which look at a
data word in parallel), and the multiple PROCESS algorithm,
which performs independent functions in parallel. Parallel
data processing will be considered in Chapter VI.

The alternate solution (control path logic) to the
Gray code decoder is shown in Appendix B for comparison to
gc.mac and gc2.mac. The script and cif files are included
also for comparison.






;GRAY CODE to BINARY conversion algorithm
(program gc2 2
  (def 1 ground)
  (def 2 phia)
  (def 3 phib)
  (def 4 phic)
  (def reset signal input 5)
  (def inp port input (6 7))
  (def bin port output (8 9))
  (def 10 power)
  (process grycod 0
    msbs
      (cond ((= 0 inp) (setq bin inp) (go msbs))
            ((= 1 inp) (setq bin inp) (go compl)))
    compl
      (cond ((= 0 inp) (setq bin (word-not inp)) (go compl))
            ((= 1 inp) (setq bin (word-not inp)) (go nextbit)))
    nextbit
      (cond ((= 0 inp) (setq bin inp) (go nextbit))
            ((= 1 inp) (setq bin inp) (go compl)))))

Figure 4.16 Gc2.mac
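
The state sequence of gc2.mac can be exercised in software as a
quick check of the decoding rule. The following is a minimal
Python sketch, not part of the MacPitts flow. It assumes a
one-bit serial input presented most significant bit first and
models word-not as a one-bit complement; the function name
gray_to_binary_serial is hypothetical.

```python
def gray_to_binary_serial(gray_bits):
    """Decode Gray-coded bits (MSB first) using the three gc2.mac states."""
    state = "msbs"
    binary_bits = []
    for inp in gray_bits:
        if state in ("msbs", "nextbit"):
            # Previous binary bit was 0: pass the input through.
            binary_bits.append(inp)
            if inp == 1:
                state = "compl"
        else:  # state == "compl"
            # Previous binary bit was 1: output the complement (word-not).
            binary_bits.append(1 - inp)
            if inp == 1:
                state = "nextbit"
    return binary_bits


if __name__ == "__main__":
    # Gray code 1 1 1 decodes to binary 1 0 1.
    print(gray_to_binary_serial([1, 1, 1]))
```

In this model msbs and nextbit behave identically, which is
consistent with a state diagram that has not been minimized.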



Figure 4.17 Gc2.cif






C. A BLACKJACK GAME

The previous section discussed MacPitts sequential
logic implementation as a function of algorithmic syntax.
A simple finite state machine was developed, and the
structural ramifications of the source algorithm were
investigated. This section will discuss development of a
more complex algorithm, and its consequent structure.

1. The Algorithm

The blackjack game algorithm was developed based
on the following rules. The rules are expressed as FSM
states, since the transition to MacPitts syntax is easier
that way. The capitalized words correspond to node names
and MacPitts variables.



s0: START, initialize
s1: ACCEPT card (?) (F, go s0), add FACE value to SCORE
s2: if ace and no prior ace valued as 11,
    SCORE=SCORE+10
s3: if SCORE<=16, HIT, go s1
s4: if SCORE>21 and previous ace valued as 11,
    SCORE=SCORE-10, go s5
s5: if SCORE>21 and no previous ace valued as 11,
    BROKE, go s1
s6: if 17<=SCORE<=21, STAND, go s1



The next step is to create a state transition
diagram, and then to translate the game rules into the
appropriate MacPitts entities (ports, registers, signals,
and flags). This is usually done from an English
description, and then the number of states is minimized by
standard techniques. Figure 4.18 shows the transition
diagram, which is not minimized for the sake of clarity.
There are seven nodes in the diagram. The top node is
start, the initial state and the state to which the FSM
reverts when the reset signal is brought high. The next
node is draw, where the player draws a card (simulated by
an off-chip random number generator). The third node is
labelled ace, and represents decisions made if an ace is
drawn. The next node, htchk, checks for a hit condition
(draw another card). Following htchk is devalu, which
decrements the rscore contents when appropriate. Then the
broke (lose game) condition is tested in the brkchk (broke
check) state. Finally, the stand check node, stchk, tests
if the stand (win) condition exists, and the program
returns to the initial state for either replay or
termination. The state transitions follow from the
preceding rules. The MacPitts driver algorithm is written
on the basis of the state transition diagram. The driver
is shown in Figure 4.19.

Storage elements are required for state
transition decisions under the CONDs, so these variables
must be flags (aceflg and acptflg). Line 11 in the source
code reflects this. The arithmetic comparisons are made on
integer values, and these must likewise be storage
elements, so this variable is defined as a register
(rscore, line 10). Since the FSM progresses asynchronously
with the output (no new output with each clock cycle),




Figure 4.18 Blackjack Game State Transitions 
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1 ;B5.MAC BLACKJACK MACHINE 

2  (program blackjack 5
3    (def 1 ground)(def 2 phia)(def 3 phib)(def 4 phic)
4    (def face port input (5 6 7 8 9))
5    (def hit signal output 10)(def stand signal output 11)
6    (def broke signal output 12)
7    (def score port output (13 14 15 16 17))
8    (def accept_card signal input 18)
9    (def reset signal input 19)
10   (def 20 power)(def rscore register)
11   (def aceflg flag)(def acptflg flag)
12   (always (setq acptflg accept_card))
13   (process play 0
14   start
15     (cond (acptflg (setq rscore 0)(setq aceflg f)))
16   draw
17     (cond (acptflg (setq rscore (+ rscore face))
18                    (setq score rscore) (go acenode))
19           (t (go start)))
20   acenode
21     (cond ((and (= face 1) (not aceflg))
22            (setq rscore (+ rscore 10))
23            (setq score rscore)
24            (setq aceflg t)))
25   htchk
26     (cond ((unsigned-<= rscore 16)(setq hit t) (go draw)))
27   devalu
28     (cond ((and aceflg (unsigned-> rscore 21))
29            (setq rscore (- rscore 10))
30            (setq aceflg f)
31            (setq score rscore)
32            (go htchk)))
33   brkchk
34     (cond ((and (unsigned-> rscore 21)(not aceflg))
35            (setq broke t) (go start)))
36   stchk
37     (cond ((and (unsigned-<= rscore 21)
38                 (unsigned->= rscore 17))
39            (setq stand t) (go start)))
     ))

Figure 4.19 B5.mac
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there must also be a port (score, 



line 7) to clock the register value to. Similarly, a port
(face, line 4) is defined as the input (face value) of the
card. Whenever an output is produced asynchronously with
the clock, the latching operation

(setq <register> integer_value)

must be made. One method of clocking the register contents
to the output port is to use the ALWAYS statement under
the PROGRAM statement.

(program <name> <data path width>
  (always (setq output_port register_contents))
  (process <name> <stack depth>

This will ensure accurate current output values. In the
blackjack algorithm, this procedure will not work. If the
statement

(always (setq score rscore))

is used, the algorithm would appear to work in the command
interpreter. Upon compilation, however, the following LISP
compiler (Liszt) diagnostic results,

Error: Non-number to minus nil
<1>

where the first line of the diagnostic indicates an
attempted arithmetic operation on an empty LISP atom or
list, and the second line is the LISP debugger prompt
[Ref. 11:p. 11-1].

The reason why this does not work (for this
algorithm) is that rscore has not been initialized (as in
Fortran, for example) at execution of the ALWAYS
statement. The LISP primitive representing rscore is at
this time a nil, or empty, atom. The solution is to clock
the register (rscore) to the output port (score) whenever
it changes value. Lines 18, 23, and 31 show this other
method of register transfer to ports.

There are some new forms in b5.mac which also
require discussion. The integer test which returns a
Boolean value to control is

(<signed><inequality type> integer1 integer2)

where the field <signed> is required, and is either blank
or the string "unsigned-" for the less than, less than or
equal, greater than, or greater than or equal tests. The
comparison is made with the <inequality type> between
integer1 and integer2.

For instance, if temp is an integer variable set
equal to 72, hot is an integer variable set to 88, and cold
is an integer variable set to 60, the following forms would
produce the signals to control shown.

FORM                                SIGNAL TO CONTROL

(cond ((= hot 88)))                 T
(cond ((unsigned-< hot 99)))        T
(cond ((unsigned-<= hot 89)))       T
(cond ((= temp hot)))               F
(cond ((unsigned-> temp hot)))      F
(cond ((unsigned->= 70 temp)))      F

The result of the integer comparison test is a Boolean
value, and as such is used as a conditional under COND, as
shown in Figure 4.19.
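
The signals in the comparison table can be checked with
ordinary integer comparisons. The short Python sketch below is
only an illustration: it assumes the unsigned- tests behave as
plain comparisons on non-negative integers, and the helper
signal is a hypothetical name.

```python
# Values taken from the example in the text.
temp, hot, cold = 72, 88, 60


def signal(value):
    """Render a Boolean as the T/F signal sent to control."""
    return "T" if value else "F"


# Each MacPitts form is paired with an equivalent Python comparison.
results = {
    "(= hot 88)": signal(hot == 88),
    "(unsigned-< hot 99)": signal(hot < 99),
    "(unsigned-<= hot 89)": signal(hot <= 89),
    "(= temp hot)": signal(temp == hot),
    "(unsigned-> temp hot)": signal(temp > hot),
    "(unsigned->= 70 temp)": signal(70 >= temp),
}

if __name__ == "__main__":
    for form, sig in results.items():
        print(f"{form:24s} {sig}")
```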

The remaining forms in the algorithm have been
previously explained. The algorithm b5.mac (which required
five tries to obtain a successful compilation) follows the
FSM state transition diagram with the syntax given. The
algorithm has been exhaustively tested (only possible with
simple FSMs) in the command interpreter.
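
The scoring rules behind b5.mac can also be sketched in
software. The Python model below is a behavioral illustration
only, written under stated assumptions: it is not cycle
accurate, it applies the devalue step before the broke and
stand checks rather than looping through htchk, and the name
play_hand is hypothetical. Cards are given as face values with
the ace equal to 1.

```python
def play_hand(cards):
    """Return (outcome, score) for a sequence of card face values."""
    score = 0
    ace_as_11 = False  # plays the role of aceflg
    for face in cards:
        score += face  # draw: add FACE to SCORE
        if face == 1 and not ace_as_11:
            score += 10  # acenode: value the ace as 11
            ace_as_11 = True
        if ace_as_11 and score > 21:
            score -= 10  # devalu: revalue the ace as 1
            ace_as_11 = False
        if score > 21:
            return "broke", score  # brkchk
        if score >= 17:
            return "stand", score  # stchk
        # htchk: score <= 16, so HIT is asserted and another card is drawn
    return "hit", score


if __name__ == "__main__":
    print(play_hand([10, 6, 1]))  # score peaks at 27 before the ace is devalued
```

The hand 10, 6, ace reaches an intermediate score of 27,
which is the maximum the register must hold.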

2. The Chip 

Figure 4.20 shows the cifplot resulting from
b5.mac. The appearance is similar to the Gray code decoder
layout, with the exception of an added functional block at
the top right. This is the flag block, resulting from line
11 in b5.mac. The flag block is both a source and a
destination for control signals, as the driver syntax
suggests.

The data path is organized in five parallel
units, as expected from line 2 in b5.mac. There are
seven states in the FSM, so only three of the five
instantiated sequencer tails are connected to control (the
other two are vestigial: instantiated, yet not used).
Since four integer values were used in the comparisons,



Figure 4.20 B5.cif
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the data path is required to generate the comparison 



integers. This must be considered in designing an
algorithm, in assigning the data word length under the
PROGRAM statement. The maximum score possible for the
blackjack game is 27, so the minimum word width is five.
Another reason for the lengthened data path is the number
of arithmetic tests made. The integer values for hit,
stand, broke, and devalu are made within the data path,
since syntax specifies structure in MacPitts. In the Gray
code decoder, the comparison tests generate combinational
logic in the data path which sends a signal to control. As
more data path tests are required, a longer data path will
result.

The Weinberger array of the blackjack chip shows
a multi-level structure similar to the Weinberger array
for the Gray code decoder. As the Weinberger array grows
in complexity, it becomes increasingly difficult to
understand its function in terms of a gate level
equivalent. The correct by construction property of
MacPitts is intended to assure correct operation of large
control path circuits nonetheless. The compilation session
recording in Appendix B shows the MacPitts instantiation
process for the blackjack machine, which follows the same
general scheme as for the Gray code decoder.



D. MEAD-CONWAY TRAFFIC LIGHT CONTROLLER

The functional description of the Mead-Conway traffic
light controller is taken from [Ref. 4:p. 85]. The chip
controls a traffic light at a highway-farm road
intersection.

1. The Algorithm

Design of the algorithm follows principles stated
previously. After the desired function is understood, an
automaton (state diagram) is drawn. From this, the
algorithm is written. The placement of the logic is
determined by syntax, and the selection of storage
entities (flags or registers) follows.

The light controller controls the three-light
traffic signals at the intersection of a busy highway and
a less busy farmroad. The input signals are C (car on the
farmroad), TL (long timeout), and TS (short timeout). The
outputs are ST (start timer), FL0 and FL1 (encode the
color of the farmroad light), and HL0 and HL1 (encode the
highway light color). An FSM is appropriate to represent
the sequential nature of the traffic light cycling. Figure
4.21 shows the state transition diagram, with labels
corresponding to the MacPitts states in the algorithm.

Next, the algorithm is written. A control path
architecture is chosen for ease in setting the output bits
(initially, the output bits are set individually). Storage
elements (flags) are not needed for this example, since



Figure 4.21 Light Controller State Transition Diagram 
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the outputs are synchronously produced, and constant 
throughout a given state. In control path circuits using
Boolean variables, the value goes to FALSE at the next
state transition unless it is explicitly set to TRUE. So
storage of the output values would be required if they
were to be output within a different state from that in
which they are determined. For example, if the light
control signals for the highway yellow (HY) state were
produced in the previous state (HG), then they would
require latching so the correct values would remain after
the state transition. If the chip were to be produced,
however, the outputs would require latching as explained
in the previous section, since the chip clock is many
times faster than the light timer clock.

The output bits which control the farmroad and
highway light colors must be encoded. The following table
is used

HL0   HL1
FL0   FL1
 0     0     GREEN
 0     1     YELLOW
 1     1     RED

and the output bits are explicitly set to Boolean values
in the SETQ forms.

Figure 4.22 is the MacPitts algorithm to create
the traffic light controller. The format is similar to the
previous FSM drivers, with the exception of the absence of
data path combinational logic. The data path width must be



;MEAD-CONWAY LIGHT CONTROLLER
;Set the D.P. width to 2 <4 nodes in FSM dgm>
(program lc2 2
  (def 13 power)
  (def 1 ground)
  (def 2 phia)
  (def 3 phib)
  (def 4 phic)
;The following 3 SIGNALS are control inputs:
  (def c signal input 5)
  (def tl signal input 6)
  (def ts signal input 7)
;The RESET signal is required for all FSMs:
  (def reset signal input 14)
;Define 5 output SIGNALS (=> C.P.) to
;Control the TIMER & HW/FR traffic light:
  (def st signal output 8)
  (def hl0 signal output 9)
  (def hl1 signal output 10)
  (def fl0 signal output 11)
  (def fl1 signal output 12)
;The PROCESS statement implies FSM sequencing,
;The stack depth is zero:
  (process light_controller 0
;The HIGHWAY GREEN state; output=f(PS & PI)
;where <hg>=PS, and <C,TL,TS>=PI:
  hg
    (cond ((not (and c tl))
           (setq hl0 f)
           (setq hl1 f)
           (setq fl0 t)
           (setq fl1 f)
           (setq st f)
           (go hg))
          (t (setq hl0 f)
             (setq hl1 f)
             (setq fl0 t)
             (setq fl1 f)
             (setq st t)
             (go hy)))
;The HIGHWAY YELLOW state and associated
;outputs & state transitions (<go >)
;[see text for output encoding table and
;explanation of state transition syntax]:
  hy
    (cond ((not ts)
           (setq hl0 f)
           (setq hl1 t)
           (setq fl0 t)
           (setq fl1 f)
           (setq st f)
           (go hy))
          (t (setq hl0 f)
             (setq hl1 t)
             (setq fl0 t)
             (setq fl1 f)
             (setq st t)
             (go fg)))
;The FARMROAD GREEN state and associated
;outputs & state transitions:
  fg
    (cond ((not (or tl (not c)))
           (setq hl0 t)
           (setq hl1 f)
           (setq fl0 f)
           (setq fl1 f)
           (setq st f)
           (go fg))
          (t (setq hl0 t)
             (setq hl1 f)
             (setq fl0 f)
             (setq fl1 f)
             (setq st t)
             (go fy)))
;The FARMROAD YELLOW state:
  fy
    (cond ((not ts)
           (setq hl0 t)
           (setq hl1 f)
           (setq fl0 f)
           (setq fl1 t)
           (setq st f)
           (go fy))
          (t (setq hl0 t)
             (setq hl1 f)
             (setq fl0 f)
             (setq fl1 t)
             (setq st t)
             (go hg)))))

Figure 4.22 Lc2.mac
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nevertheless declared with the PROGRAM statement- The data 



path width is two, to permit instantiation of two
sequencers to cycle through the four states of the FSM.
The initial attempt at lc2 erroneously used a data path
width of five, and the algorithm compiled to cif. The
resulting cifplot had a data path width of five bits,
only two of which were connected to the sequencer tails
to remember and address the states. The other three data
path units took up chip space, but performed no function.
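
The state transitions of lc2.mac (Figure 4.22) can be
sketched as a next-state function. The Python model below is
an illustration under stated assumptions: it reproduces only
the GO targets and the ST output of each COND branch, it omits
the light-color output bits, and the function name next_state
is hypothetical.

```python
def next_state(state, c, tl, ts):
    """Return (next_state, st) for the light controller FSM.

    c, tl, ts are the Boolean inputs (car on farmroad, long timeout,
    short timeout); st is the start-timer output.
    """
    # Each entry: (condition to leave the state, state entered on leaving).
    transitions = {
        "hg": (c and tl, "hy"),
        "hy": (ts, "fg"),
        "fg": (tl or not c, "fy"),
        "fy": (ts, "hg"),
    }
    leave, target = transitions[state]
    leave = bool(leave)
    # ST is asserted only in the branch that changes state.
    return (target if leave else state), leave


if __name__ == "__main__":
    state = "hg"
    for inputs in [(True, True, False), (False, False, True),
                   (False, False, False), (False, False, True)]:
        state, st = next_state(state, *inputs)
        print(state, st)
```

Four states are visited in a fixed cycle, which is why two
sequencer bits are sufficient for this machine.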
2. The Chip

The cifplot resulting from lc2.mac is shown in
Figure 4.23, and the script of the compilation session is
in Appendix B. The cifplot resembles the previous two
FSM cifplots, but lacks flags and data path logic. The
only registers shown are those which receive and store
state information from the sequencer tail. As usual, they
lie in the data path above the clock drivers. Other than
that, the cifplot for lc2 has no data path. This is
expected in view of the driver algorithm, and the script
file of the compilation shows only six data path
organelles but 43 columns in control. A handcrafted
version of this chip could be produced with just a data
path, if a two phase clock is used. This will be
considered in the next chapter.




Figure 4.23 Lc2.cif
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E. 



SUMMARY 



This chapter has considered three examples of MacPitts
sequential logic: the Gray code decoder, the blackjack
game, and the Mead-Conway light controller. In each case,
the Mealy FSM convention of MacPitts led to an easy
transition from state diagram to algorithmic description.
The Mealy architecture is evident in both the MacPitts
algorithm and the resulting chip layout.

In the algorithm, each state is given a name (e.g.,
HIGHWAY GREEN, HIGHWAY YELLOW) and within each state the
outputs are determined with the COND form and set
accordingly. The output is a function of both present state
and present input (e.g., CARS, TIMEOUT_LONG,
TIMEOUT_SHORT).

The same Mealy logic is evident in the circuit layout
(cifplot). The sequencer stores the present state, and
multiplexers driven by the Weinberger array and present
inputs determine the next-state transitioning by
controlling the inputs to the bank of present-state
registers.

Sequential logic in MacPitts can be influenced by the
designer in the same way as combinational logic can, by
explicitly specifying the desired outputs. The alternative
is to specify the outputs as an implicit function of
either inputs (ports, input signals) or intermediate
results (internal signals, flags, registers). In general,
when the explicit specification of outputs is used

(setq score 19)

rather than the functional specification of outputs

(setq score (+ rscore face))

a smaller and faster circuit will result. The explicit
specification of outputs is therefore the preferred
method, though not always possible. If there are many
possible outputs, it may even be better to use the
functional specification of outputs rather than attempting
to specify each one explicitly.

The data path width for a MacPitts sequential machine,
as specified in the PROGRAM statement, must be large
enough to address the number of states. That is, the data
path width must be greater than or equal to log (base 2)
of the number of states in the state transition diagram.
If this condition is not met, MacPitts (the silicon
compiler) will not successfully compile the source
algorithm. The reason for this requirement is the manner
in which MacPitts lays out the sequencer and data path.
The sequencer and data path are laid out contiguously, in
a linear bit-slice configuration. The width of both is the
width of the data path as specified in the PROGRAM
statement (this number is also the number of present-state
registers instantiated). Since there must be the same
number of i/o ports as the data path width, and since not
all of these ports may be used for data i/o, one solution
to the problem of extra ports is to ground them in the
circuit in which the chip is to be used (as suggested for
the Gray code decoder, where only one port was necessary,
but two ports had to be specified to allow enough state
transitions). The alternate solution for the Gray code to
binary conversion routine is to treat the data as a serial
stream, one bit wide. This suggests using SIGNALs (instead
of PORTs) as inputs, and processing the Gray code as Boolean
data instead of integer data. This algorithm is included for
completeness in Appendix B, with the resulting cifplot and
script of the compilation process.
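
The width requirement can be illustrated with a short
calculation. The Python sketch below is a hedged illustration,
not a description of MacPitts internals: the function name
sequencer_usage is hypothetical, and raising an error stands in
for the failed compilation described above.

```python
def sequencer_usage(data_path_width, num_states):
    """Return (connected, vestigial) sequencer counts for an FSM.

    connected is the number of state bits needed, i.e. the smallest
    integer greater than or equal to log2(num_states).
    """
    connected = (num_states - 1).bit_length()
    if data_path_width < connected:
        # Stand-in for the case where the source will not compile.
        raise ValueError("data path width too small for the number of states")
    return connected, data_path_width - connected


if __name__ == "__main__":
    print(sequencer_usage(5, 7))  # blackjack machine: seven states, width five
    print(sequencer_usage(2, 4))  # light controller: four states, width two
```

For the blackjack machine this gives three connected and two
vestigial sequencers, and for the erroneous five-bit light
controller it gives two connected and three vestigial, in
agreement with the cifplots discussed in this chapter.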

MacPitts provides a convenient method to compare both
Boolean and integer values, which is particularly useful
in the decision-making under a COND. The Boolean
comparisons (Figure 4.22) are used to test the value of a
flag or a signal, and the integer comparisons (Figure
4.19) are used to compare numerical values in ports or
registers. In each case, the result is a Boolean signal to
control which affects subsequent state transitioning or
setting of outputs.

Algorithm design for MacPitts FSMs begins with the
decision of how much data it is desired to process
simultaneously, and in what form that data presents itself
to the chip. For instance, if a serial FSM chip is desired
(e.g., a serial Gray code decoder), the data word is one
bit wide. The inclination is therefore to treat the data as
Boolean type, which is feasible for FSM architectures for
reasons explained previously. The designer is not
constrained to integer data types in this case (although the
examples presented in Figure 4.2 and Figure 4.16 used
integer data types). If the data comes to the chip for
parallel processing in an n-bit word, however, the
inclination is to treat the data as integer type (for
example, the blackjack algorithm). This is not always
possible, for reasons to be explained in connection with
Hamming error correction in Chapter VI (MacPitts does not
permit implicit setting of bits within a data word).

Algorithm design may be viewed as the designer's
influencing of the chip layout. Since circuit structure is a
function of syntax (on a lower level), it is reasonable to
assume that chip layout is a function of algorithm structure
(on a higher level). That is, syntax determines not only the
individual circuit elements (NANDs, ORs, XORs, ports, flags,
registers, etc.) of the chip, but also determines how the
individual elements work in concert. The source algorithm
lc2.mac shown in Figure 4.22 used Boolean control signals as
inputs (C, TL, TS). The resulting cifplot in Figure 4.23
shows a Weinberger array at the bottom, and no data path
except for a bank of two sequencer organelles at the top of
the chip. This chip can be viewed as a control path chip. An
alternate design would use a five-bit word (representing the
signals HL0, HL1, FL0, FL1, and ST) as the output, and
retain the three control signals as inputs. Appendix B shows
dplc2.mac (the data path equivalent of Figure 4.22,
lc2.mac), and the resulting cifplot. The output bits are set
explicitly by setting the output word values in the .mac
file. This results in a larger data path, as expected, since
the output decisions result in data path operations instead
of control path operations. The control path is smaller than
in lc2.cif, since the Weinberger array has fewer decisions
to make. Appendix B also contains the script file of the
compilation of dplc2.mac.

Yet another version of the light controller would assign
the input values to a three bit word (representing C, TS,
and TL), and make the conditional checks on the input
control word with the BIT statement. This solution would
result in a still larger data path and a smaller control
path than the two previous light controller chips. Just as
in any high-level language, there exist many ways of
solving a given problem with MacPitts. The best way to solve
the problem must consider not only the algorithm, but the
structural (layout) consequences of algorithmic syntax. The
"best" solution is arrived at by experience in MacPitts
programming, knowledge of the consequences of syntax, and
finally, iteration toward a better solution (trial and
error).



V. A COMPARISON OF A MACPITTS DESIGN
WITH A HANDCRAFTED EQUIVALENT

Previous chapters illustrated some inefficiencies
inherent in the MacPitts layout scheme. The Weinberger array
and the data path both use transverse polysilicon wires for
cross-communication, and poly has the highest specific
resistance of the three possible NMOS wire materials. The
one dimensional river routing method used is not optimal
because the input, output, and data/control lines required
are long. The sequencer organelles are instantiated
according to the data path width, and not according to the
number of states necessary. The Weinberger array generates
multiple cascaded gates to implement multiple output
combinational logic functions, causing long signal delays in
comparison to a PLA. A handcrafted version of a functionally
equivalent chip is compared to a MacPitts design to
investigate these differences both quantitatively and
qualitatively.

A. THE HANDCRAFTED TRAFFIC LIGHT CONTROLLER

The standard for this comparison is a handcrafted (CAD)
version of the Mead-Conway traffic light controller which is
compared to the MacPitts generated version in terms of speed
and power consumption. Qualitative observations are also
described. The custom-made traffic light controller was
constructed on the Caesar VLSI graphics editor with the aid
of various VLSI CAD tools.

1. Design

The MacPitts-produced traffic light controller was
described in Chapter IV. MacPitts design is just a
matter of generating a prototype MacPitts driver program,
and refining it until an acceptable archetype algorithm is
achieved. This is done in both the command interpreter
(algorithmic optimization), and in Caesar (structural
optimization). Caesar allows the designer to see the
structure and analyze it with power estimators (Powest) and
timing estimators (Crystal, SPICE). Moving pads and deleting
vestigial structures are examples of possible structural
optimizations using Caesar (this procedure should be
considered if the MacPitts chip is to be fabricated).

The standard VLSI design scheme is similar to
MacPitts design in that structure is considered as a
function of behavior. The behavior is not constrained to
follow a given algorithmic syntax, though, as it is in
MacPitts. So custom design is more flexible than silicon
compiler design, since the designer can choose any
desired structure to implement the behavior called for.

The standard NMOS PLA is used for the hand-crafted
light controller. Mead and Conway [Ref. 4:pp. 80-88]
develop the state transition table for the light controller,
and provide a sticks diagram of the clocked PLA FSM. The
following PLA is based on the Mead-Conway development.

Ousterhout [Ref. 9] illustrates use of Eqntott and
Reference 10 illustrates use of Tpla to generate this PLA.
Eqntott is a VLSI CAD program which takes logic equations as
the input and produces a PLA truth table as the output. This
truth table is the input to Tpla (Technology independent
Programmed Logic Array), and Tpla further allows the
designer to geometrically modify the PLA. The result of Tpla
processing the truth table is a Caesar representation of the
desired PLA. Figure 5.1 shows the input logic equations for
Eqntott, and Figure 5.2 shows the resulting truth table from
Eqntott.

The best method to design a PLA is to create the
logic equations as in Figure 5.1, and then use the Unix
pipeline to send the result of Eqntott to Tpla:

eqntott [options] infilename | tpla [options] outfilename

The result is a Caesar file of the PLA layout, which must be
converted to cif in Caesar as previously described. Figure
5.3 shows the -trans PLA (inputs and outputs on opposite
sides) generated from the command

eqntott -l -f -R stoplt | tpla -s Btrans -I -O -o stoplt.ca

which took 28 seconds to complete. The eqntott switch -l
means list the truth table, -f means to connect the feedback
paths in the PLA, and -R directs eqntott to minimize the
truth table. The tpla switch -s selects the PLA type (-trans
shown), and -I and -O indicate clocked inputs and outputs.
This command string creates an NMOS FSM Caesar file. It was
determined later that a -cis PLA (input and output on the
same side of the PLA) would fit the chip frame better. The
change is simple. The same command string as above was
issued, except Bcis was substituted for Btrans.

Figure 5.1 Stoplight Logic Equations for Eqntott

Figure 5.2 Truth Table Input for Tpla

The PLA is a fast structure. Appendix A shows the
interactive Crystal session showing the timing analysis of
just the PLA. The delays are determined to be 26.93 ns for
phia and 32.06 ns for phib. For symmetric phia and phib
durations, with each having the duration of the slowest
critical path, or 32.06 ns, the maximum clock rate is 15.6
MHz. The maximum clock rate is calculated as the inverse of
twice the slowest critical path time. The use of Crystal on
non-overlapping, two-phase clocking schemes is described in
[Ref. 3:pp. 80-93].
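The clock-rate arithmetic can be checked with a short sketch. It is
only an illustration in Python, not part of the Crystal tool set; the
function names are hypothetical. The symmetric case follows the rule
stated above (inverse of twice the slowest phase). The asymmetric case
is an assumption added for comparison: each phase lasts only as long as
its own critical path, which reproduces the 16.95 MHz figure quoted for
the PLA later in this chapter.

```python
# Illustrative sketch only: clock-rate estimates from the two Crystal
# phase delays quoted in the text. Function names are hypothetical.

PHIA_DELAY_NS = 26.93  # critical path delay for phia (from the text)
PHIB_DELAY_NS = 32.06  # critical path delay for phib (from the text)


def symmetric_clock_mhz(phia_ns, phib_ns):
    """Both phases are stretched to the slower one: f = 1 / (2 * max)."""
    return 1e3 / (2.0 * max(phia_ns, phib_ns))


def asymmetric_clock_mhz(phia_ns, phib_ns):
    """Assumed alternative: each phase as long as its own path: f = 1 / sum."""
    return 1e3 / (phia_ns + phib_ns)


sym = symmetric_clock_mhz(PHIA_DELAY_NS, PHIB_DELAY_NS)
asym = asymmetric_clock_mhz(PHIA_DELAY_NS, PHIB_DELAY_NS)
print(f"symmetric phases:  {sym:.2f} MHz")
print(f"asymmetric phases: {asym:.2f} MHz")
```

The two conventions differ by less than ten percent here because the
phia and phib delays are nearly equal.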

The sequential logic for the light controller chip
is made with the University of Washington/Northwest
Consortium CAD tools as described above. All that is lacking
is the power and ground connections and the pads. Usually
the power and ground busses are laid out by hand (Caesar) or
specified in cartesian coordinates (CLL, Chip Layout
Language, a method of specifying mask polygons, their
dimensions, and the fabrication process required), and the
pads are then invoked from an existing library of VLSI macro
cells. MacPitts can shorten the design time by doing most of
this work for the designer. Figure 5.4 shows the algorithm
stoplt_frm.mac used to create the frame for the PLA FSM. The
frame is created like wire.mac (Figure 2.1), in that it is
just wires from input to output. The wires are deleted in
Caesar, and the PLA is placed in the center of the chip
frame. Figure 5.5 shows the resulting chip. The clocked
-cis PLA is in the center of the chip, connected to
appropriate inputs and outputs (tpla makes this connection
easy, since it labels all inputs and outputs). The third
clock pad (phic) is deleted in Caesar. This chip still has
long indirect metal runs and lots of white space.

Figure 5.3 -Trans PLA Resulting from Eqntott and Tpla

2. Optimization and Analysis

Figure 5.6 shows a condensed version of the chip,
stoplt_minc.cif. The area of the chip shown in Figure 5.6 is
40% smaller than the chip in Figure 5.5, and still more
reduction is possible. Since there are 12 pads, it would be
better to place three per edge on the chip. The signal wires
could also be shortened by judicious choice of pad placement
in the .mac algorithm. And finally, all sides could be
brought closer together. There exists a synergistic
relationship between the existing CAD tools and MacPitts
that bears further study.



;stoplt_frm.mac
;This pgm creates a design frame for the stoplight
;controller [cf. Mead & Conway, p.81, 2nd printing]
;hand-crafting will be required to merge the PLA
;FSM created by eqntott|tpla into this frame. CAESAR
;is used to do this.

(program stoplt_frm.mac 5
  (def 13 power)
  (def 1 ground)
  (def 2 phia)
  (def 3 phib)
  (def 4 phic)
  ;inputs to light controller PLA FSM
  (def c signal input 5)
  (def tl signal input 6)
  (def ts signal input 7)
  ;outputs from light controller PLA FSM
  (def st signal output 8)
  (def hl0 signal output 9)
  (def hl1 signal output 10)
  (def fl0 signal output 11)
  (def fl1 signal output 12)
  (always
    ;here we setq 5 simple dummy paths. These are chosen with a
    ;view towards later simple editing in CAESAR
    (setq st c)
    (setq hl0 tl)
    (setq hl1 ts)
    (setq fl0 c)
    (setq fl1 tl)))

Figure 5.4 Stoplt_frm.mac












Figure 5.5 Stoplt_chp.cif

Figure 5.6 Stoplt_minc.cif

Intervention by the designer, however, is antithetical to
the goal of silicon compilation. The silicon compiler has a
ruleset which (in theory) guarantees the property of
"correct by construction". This property states that the
chip design will always be functionally correct; it cannot
be wrong. Circuit density is not the primary goal, nor is
speed.

The MacPitts designer has no control over circuit
density, other than Boolean optimization of the algorithmic
forms as explained in Chapters II and III. The designer does
have some control over chip speed. There are two ways of
optimizing throughput in a MacPitts design. The first method
is explained at the beginning of Chapter III, and can be
thought of as algorithmic optimization. The objective is to
write an algorithm which executes in a minimum number of
clock cycles. The verification is done in the command
interpreter. PAR, COND, and PROCESS are used wherever
possible to parallel operations.

The second method of controlling chip speed is
through circuit optimization (this too is a function of
syntax in MacPitts). The designer chooses either the data
path or the control path or a hybrid of both, and with
Crystal designs a chip which has a maximum speed per clock
cycle. The throughput is then the product of the inverse of
the number of clock cycles required for a valid result and
the cycle rate (results/cycle x Hz = results/sec).
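The throughput relation can be written out as a minimal sketch. The
function name is hypothetical, and the two-cycle case is an assumed
example chosen only to show the effect of the cycle count; the 6.85 MHz
clock is the PLA chip figure quoted in this chapter.

```python
# Illustrative sketch only: throughput = (results/cycle) x (cycles/sec).

def throughput(cycles_per_result, clock_hz):
    """Return results per second for a chip clocked at clock_hz."""
    if cycles_per_result <= 0:
        raise ValueError("cycles_per_result must be positive")
    results_per_cycle = 1.0 / cycles_per_result
    return results_per_cycle * clock_hz


single_cycle = throughput(1, 6.85e6)  # one result every clock cycle
two_cycle = throughput(2, 6.85e6)     # assumed: two cycles per result
print(single_cycle, two_cycle)
```

Halving the number of cycles per result has the same effect on
throughput as doubling the clock rate, which is why both algorithmic
and circuit optimization are worth considering.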






Furthermore, the circuit speed can be increased by
judicious placement of pads in the .mac file. It is not
always apparent where the routing will go beforehand, so the
recommended method is to create a prototype cifplot, and
then modify the pad numbering in the .mac file to decrease
signal path lengths from the pads to the logic elements. For
example, in stoplt_minc.cif (Figure 5.6), the phia pad would
be moved to center left on the chip frame, phib to center
right, ground to top right, and C, TL, and TS would be moved
to the lower left corner region to decrease metal run
lengths. Not all of these suggested moves are possible due
to the way MacPitts places pads, so Caesar editing is
required to optimize the MacPitts design if minimal length
runs are desired.

Appendix C contains the Crystal analysis of the PLA
traffic light controller. The chip speed is limited to the
inverse of the sum of the critical propagation times, or
6.85 MHz. This is less than half the speed of just the PLA
(16.95 MHz, the inverse of the sum of the phia and phib
delays). Appendix C also contains the Powest analysis of
the PLA traffic light controller.

B. COMPARISON WITH MACPITTS DESIGN 

Appendix C contains the Crystal command file for the
MacPitts traffic light controller timing analysis. Froede
[Ref. 3:pp. 80-85] explains the analysis of a MacPitts
design with Crystal. The Crystal command file in Appendix C
shows just the commands issued to Crystal, and in
parentheses to the right, the time delay values returned
(representing an actual Crystal session).

Figure 4.23 shows the chip on which this Crystal
analysis was done. The critical path is from phic to the
clock drivers to the state registers. The clock drivers
induce a cumulative delay of 23.9 ns, and the state
registers a cumulative delay of 114.2 ns, so the transition
induces a delay of 90.3 ns. The Weinberger array induces
another 173 ns, and the slowest path is from there to the ST
pad. The total delay is 363.52 ns, for a maximum speed of
2.75 MHz. This speed is 40% of the maximum speed of the PLA
light controller.
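The figures in this paragraph follow from simple arithmetic, sketched
below. The constant and variable names are hypothetical; the values are
the ones quoted in the text.

```python
# Illustrative sketch only: re-deriving the quoted MacPitts timing figures.

CLOCK_DRIVER_CUMULATIVE_NS = 23.9     # cumulative delay after the clock drivers
STATE_REGISTER_CUMULATIVE_NS = 114.2  # cumulative delay after the state registers
TOTAL_DELAY_NS = 363.52               # total critical path delay
PLA_CHIP_MHZ = 6.85                   # maximum speed of the PLA light controller

# Delay contributed by the state register transition alone.
register_transition_ns = STATE_REGISTER_CUMULATIVE_NS - CLOCK_DRIVER_CUMULATIVE_NS

# Maximum clock rate is the inverse of the total delay (ns -> MHz).
macpitts_mhz = 1e3 / TOTAL_DELAY_NS

# Speed of the MacPitts chip relative to the PLA chip.
speed_ratio = macpitts_mhz / PLA_CHIP_MHZ

print(f"register transition: {register_transition_ns:.1f} ns")
print(f"MacPitts max speed:  {macpitts_mhz:.2f} MHz")
print(f"ratio to PLA chip:   {speed_ratio:.0%}")
```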

Figures 5.7 and 5.8 show the floorplans of each version
of the traffic light controller. In Figure 5.7, the PLA FSM
is comparatively simple. The FSM is a small clocked PLA with
feedback. The connections to the pads are all metal (not
shown). Figure 5.8 is the MacPitts version, and is far more
complicated. The control path is large, and induces the
largest part of the delay. The present state (PS in Figure
5.8)-next state mechanism is much more complex than the
simple PLA feedback generated by eqntott and tpla. The wires
between the data and control paths are poly, as are the PS
feedback lines in Figure 5.8. These wires contribute to the
slowness of the MacPitts chip. The wires to the pads also
take a more circuitous route, inducing still more delay.

Figure 5.7 PLA Stoplight Chip Floorplan

Figure 5.8 MacPitts Stoplight Chip Floorplan

Table 5.1 compares the MacPitts traffic light controller and
the PLA traffic light controller.



TABLE 5.1

MACPITTS vs. HANDCRAFTING

                                PLA Chip       MacPitts Chip

delay [ns]                      146.98         501.97
max. clock freq. [MHz]          6.85           1.99
pullup transistors              5              87
avg. DC power [W]               .042           .055
max. DC power [W]               .085           .107
control path dimensions [mm]    .49 x .29      .547 x .185
data path dimensions [mm]       .178 x .173    .256 x .240
area ratio [cp/dp]              .046           8.9
chip size [mm**2]               .836           1.164



VI. DESIGN EXAMPLE: HAMMING ERROR DETECTOR/CORRECTOR



This chapter describes one method of design with
MacPitts. The procedure is to first define the problem, then
to write an initial algorithmic description of the solution
in MacPitts (the language). The initial algorithm is either
a simplified version, or a piece of the larger problem. The
simplified algorithm is tested for execution in the
interpreter, and then compiled to cif. Alternate solutions
are considered next, and simplified alternate solutions are
likewise tested. The best of these algorithms is then
chosen, based on speed, power dissipation, and size. The
chosen solution is then expanded to solve the larger
problem.

The problem is to design a parallel Hamming method error
detector/corrector which will correct single bit errors in a
15-bit encoded message.

A. THE ERROR DETECTOR 

The theory behind Hamming error detection and correction
is found in most texts on coding and information theory
[Ref. 5:pp. 39-49]. A subset of this problem is error
detection, which the prototype algorithm solves.

The prototype algorithm looks at a three bit encoded
message in parallel, and by the Hamming method determines
the bit error location. The algorithm is written to
demonstrate correct operation for three-bit messages. It can
later be expanded to cover longer word lengths.

The Hamming method scans the encoded word, and by a
series of parity checks determines the bit error position.
The single error detection method assigns the result of each
parity check to a bit of data. The word formed from the
resulting bits comprises the syndrome. The value of the
syndrome is the bit error position in the received message
(counting the leading bit as position one; a syndrome of
zero indicates no error). The parity checking is done in a
specific order. If the codeword is a string of n bits with
the lsb leading

0 1 2 3 4 5 6 7 8 ... n

then the syndrome bits are determined by parity checks
across the message bits as shown below.

syndrome bit   message bit positions for parity check

0              0 2 4 6 8 10 12 14 16 18 20 ...
1              1 2 5 6 9 10 13 14 17 18 21 22 ...
2              3 4 5 6 11 12 13 14 19 20 21 22 ...
3              7 8 9 10 11 12 13 14 23 24 25 26 27 ...

The syndrome word is read from msb to lsb and points
to the message bit which needs correcting.
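The rows of the table above follow a regular rule, sketched here in
Python as an illustration. The function name is hypothetical. The rule
assumed is the standard Hamming one: syndrome bit k checks every
position whose one-based index has bit k set, which reproduces the
zero-based positions listed in the table.

```python
# Illustrative sketch only: generate the parity check position lists.

def check_positions(syndrome_bit, codeword_length):
    """Zero-based codeword positions covered by one parity check.

    Position p is included when bit `syndrome_bit` of (p + 1) is set.
    """
    return [p for p in range(codeword_length)
            if ((p + 1) >> syndrome_bit) & 1]


for k in range(4):
    print(k, check_positions(k, 15))
```

Because the check sets are defined by the binary digits of the
one-based position, the syndrome read from msb to lsb spells out that
position directly.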

For instance, for an encoded seven bit message, there
are three check bits (represented by "c"), and four bits of
information (represented by "i") in the positions indicated
below.

0 1 2 3 4 5 6
c c i c i i i

The first bit of the syndrome (lsb) is determined by parity
checks over positions 0, 2, 4, and 6. The next bit of the
syndrome considers positions 1, 2, 5, and 6. The last bit of
the syndrome (msb) is determined from message positions 3,
4, 5, and 6. The three-bit syndrome indicates the error
position in the message string. If the received message is
0100011, the syndrome generated is 011. The syndrome
indicates an error in the third bit from the left (position
2), since the message is written with the lsb leading. The
correct message is 0110011. The Hamming method corrects
(complements) the third symbol.
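The worked example can be reproduced with a short Python sketch. It is
an illustration only, not MacPitts code; the function name is
hypothetical, and the message string is read with position 0 at the
left, as in the position table above.

```python
# Illustrative sketch only: syndrome of the seven bit example message.

def syndrome(bits):
    """Return the syndrome value for a codeword given as a list of 0/1.

    bits[0] is codeword position 0. Syndrome bit k is the parity of all
    positions p for which bit k of (p + 1) is set.
    """
    value = 0
    k = 0
    while (1 << k) <= len(bits):
        parity = 0
        for p, b in enumerate(bits):
            if ((p + 1) >> k) & 1:
                parity ^= b
        value |= parity << k
        k += 1
    return value


received = [int(ch) for ch in "0100011"]
s = syndrome(received)          # nonzero value is the one-based error position
corrected = list(received)
if s != 0:
    corrected[s - 1] ^= 1       # complement the erroneous bit
print(format(s, "03b"), "".join(str(b) for b in corrected))
```

Recomputing the syndrome on the corrected word gives zero, which
confirms that all three parity checks are then satisfied.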

1. Design Considerations

Previously in this research it was noted that
MacPitts syntax does not permit explicit bit manipulation in
the data path. To do this algorithm in the data path may be
desirable, in view of the speed of simple data path
functions. Since this is not possible, perhaps a hybrid data
path-control path algorithm should be considered. A review
of the Gray code decoder chip (Figure 4.2) will show why
this is not a good approach. The Gray code decoder is a
mixed structure, having both a data path and a control path.
The interconnections are all poly, which slows the chip
down. The multiple unPARalleled CONDs have a more
detrimental effect on speed, since each requires a clock
cycle to execute if its antecedent is true. So the target
architecture will be Boolean (control path).

The parity checks can be done by a variety of
methods in MacPitts. The simplest way is with the built in
library function PARITY, which has the format

(parity <boolean> <boolean> ...)

PARITY performs modulo two addition, and returns Boolean
TRUE to control if the argument is an odd number of TRUEs,
or Boolean FALSE if the argument is an even number of TRUEs.
So the parity checks can be done directly on the bits of the
message, in parallel, with the PARITY statement.
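The behavior described for PARITY can be modeled in a few lines of
Python. This is only a behavioral sketch of the description above, not
the MacPitts library implementation; the Python function is
hypothetical.

```python
# Illustrative sketch only: a behavioral model of the PARITY form.

def parity(*booleans):
    """Return True when an odd number of the arguments are True."""
    result = False
    for value in booleans:
        result ^= bool(value)   # modulo two addition
    return result


print(parity(True, False))        # one TRUE: odd
print(parity(True, True))         # two TRUEs: even
print(parity(True, True, True))   # three TRUEs: odd
```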

MacPitts also has a method of checking specific
bits in a data word. The BIT statement looks at a bit in the
integer-valued word, and returns a TRUE to control if the
bit is one, or a FALSE to control if the bit is zero. The
form of the BIT statement is

(bit <bit_position> <integer_expression>)
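The BIT form can likewise be modeled behaviorally. The Python function
below is hypothetical and assumes that bit position 0 is the least
significant bit of the integer word; the text does not state the bit
ordering explicitly, so this is an assumption of the sketch.

```python
# Illustrative sketch only: a behavioral model of the BIT form.
# Assumption: bit position 0 is the least significant bit.

def bit(bit_position, integer_value):
    """Return True if the selected bit of the integer is one."""
    return bool((integer_value >> bit_position) & 1)


mesg = 0b101                      # example three bit input word
out0 = bit(0, mesg)
out1 = bit(1, mesg)
out2 = bit(2, mesg)
print(out0, out1, out2)
```

Each output is a Boolean, so the results can be passed directly to
Boolean forms such as PARITY.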

Figure 6.1 is the algorithm tst.mac, used to test the BIT
statement. It is similar functionally to wire.mac, in that
it sets an output bit to an input bit. The difference is
that BIT permits a bit-by-bit conversion from integer value
to Boolean value. In Figure 6.1, the input word mesg is
integer valued. The output bits are Boolean signals (outx),
and they are setq'd to the respective bit position values of
mesg (the corrupted input word).

;TST.MAC
;A MacPitts algorithm for bit-setting of output ports
;The BIT form is used to select a specific bit of the
;input data word, and an output signal is set to
;the value of the bit selected.

;Require a D.P. width of three to accommodate the inputs
(program tst 3
  (def 1 ground)
  (def 2 phia)
  (def 3 phib)
  (def 4 phic)
  ;Use a 3-bit INTEGER as input PORT:
  (def mesg port input (5 6 7))
  ;Use 3 BOOLEAN SIGNALS as outputs:
  (def out0 signal output 8)
  (def out1 signal output 9)
  (def out2 signal output 10)
  (def 11 power)

  ;Perform bit-setting on each clock cycle:
  (always
    ;Select which bit of the input word is to
    ;be SETQ'd to the output signal pads:
    (setq out0 (bit 0 mesg))
    (setq out1 (bit 1 mesg))
    (setq out2 (bit 2 mesg))))

Figure 6.1 Tst.mac

2. Prototype Error Detector

Knowing Hamming error detection theory and the
PARITY and BIT statement syntax, an error detector algorithm
can be written. The encoded message input (mesg) is
word-valued, three bits wide. The output syndrome (syndx) is
two Boolean signals. The algorithm is shown in Figure 6.2.
The semantics of the MacPitts algorithm follow the English
description of the problem statement. The appropriate bit
patterns of the message are checked, and the syndrome bits
are set based on the results of the parity checks. This
algorithm was exhaustively tested in the command
interpreter, and serves as the prototype for the error
detector. The algorithm compiled to cif, and Figure 6.3
shows a logic structure completely in the control path. The
parallel lines at center left are the input (mesg) bits, and
result from the BIT statement. They go to the right side of
the Weinberger array, where they fan out to multiple NOR
gate inputs.

;HAM3.MAC
;A MacPitts algorithm for single-error detection
;using the Hamming method.

(program ham1 3 ;note width of data path (= width of msg)
  (def 1 ground)
  (def 2 phia)
  (def 3 phib)
  (def 4 phic)
  ;mesg is the input data word of 3 bits width with possible errors
  (def mesg port input (5 6 7))
  (def synd1 signal output 8)
  (def synd2 signal output 9)
  (def 10 power)
  (always
    ;For a 3 bit word, two parity checks are required. The
    ;result of these parity checks is a 2 bit syndrome, which
    ;indicates the bit position of the error in the 3 bit word.

    ;This cond sets or clears the lsb of the syndrome.
    (cond
      ((parity (bit 0 mesg) (bit 2 mesg))
       (setq synd1 t))
      (t
       (setq synd1 f)))

    ;This cond sets or clears the msb of the syndrome.
    (cond
      ((parity (bit 1 mesg) (bit 2 mesg))
       (setq synd2 t))
      (t
       (setq synd2 f)))))

Figure 6.2 Ham3.mac

Figure 6.3 Ham3.cif

3. Expanded Prototype



The three bit Hamming error detector is the trivial
case. The decision is in favor of the winning bits ("two out
of three"), so the syndrome is not really necessary unless
the check bits are wrong (a possibility for which the
Hamming code allows).

The Hamming code is uniform in its protection,
however; once encoded there is no difference between the
message bits (i) and the check bits (c). This is important
in checking longer words for errors. A seven bit message is
checked as in the example given above. Elaborating on the
prototype, Figure 6.4 shows the algorithm to generate the
syndrome for a seven bit parallel error detector. This error
detector requires a three bit syndrome to point at one of
the possible seven error bits in the message. Section A
above illustrates the syndrome generation process, and how
the syndrome word points at the erroneous message bit. The
resulting cifplot is shown in Figure 6.5, and the structure
is similar to the Weinberger array for the three-bit error
detector.

It is good practice to expand the algorithm in
steps, instead of going directly from the prototype to the
final design. Unexpected results can be dealt with better if
this approach is followed.



;HAM7.MAC
;A MacPitts algorithm to implement a 7 bit message error
;correction chip. The Hamming method is used. Four of the
;7 bits are data bits, 3 of the 7 are parity check positions.

(program ham7 1
  (def 1 ground)
  (def 2 phia)
  (def 3 phib)
  (def 4 phic)
  (def msg port input (5 6 7 8 9 10 11))
  (def synd1 signal output 12)
  (def synd2 signal output 13)
  (def synd3 signal output 14)
  (def 15 power)
  ;The Hamming method uses parity checks over bit positions
  ;1,3,5,and 7 to set the lsb of the syndrome,
  ;checks over positions 2,3,6,and 7 to set the middle synd bit,
  ;and checks over positions 4,5,6,and 7 to set the msb of the
  ;syndrome. The value of the syndrome indicates the bit error
  ;position in the 7 bit message.
  (always
    ;set lsb of syndrome:
    (cond
      ((parity (bit 0 msg) (bit 2 msg) (bit 4 msg) (bit 6 msg))
       (setq synd1 t))
      (t
       (setq synd1 f)))
    ;set middle bit of syndrome:
    (cond
      ((parity (bit 1 msg) (bit 2 msg) (bit 5 msg) (bit 6 msg))
       (setq synd2 t))
      (t
       (setq synd2 f)))
    ;set msb of syndrome:
    (cond
      ((parity (bit 3 msg) (bit 4 msg) (bit 5 msg) (bit 6 msg))
       (setq synd3 t))
      (t
       (setq synd3 f)))))

Figure 6.4 Ham7.mac




Figure 6.5 Ham7.cif

4. Error Detector



The desired algorithm is to uniformly detect errors
in a 15 bit message. Remembering the surprising inability of
MacPitts to compile a six input/one output gate in the data
path, a test algorithm was written for the larger message.
Figure 6.6 is the algorithm to detect errors in a 15 bit
encoded message. The syndrome bits are determined from the
parity checks as follows.

syndrome   message bit check positions

synd1      0 2 4 6 8 10 12 14
synd2      1 2 5 6 9 10 13 14
synd3      3 4 5 6 11 12 13 14
synd4      7 8 9 10 11 12 13 14

The single error detection scheme requires four
bits to select the message bit for correction, thus the four
bit syndrome. Synd1 is the lsb and synd4 is the msb of the
Boolean syndrome word. Figure 6.7 shows the cifplot
resulting from ham15.mac. The structure is predictably
similar to ham3.cif and ham7.cif (Figure 6.3, Figure 6.5).
This algorithm serves as the archetype (chief model, as
opposed to prototype, first model) for the error detector.
The error detector is half of the solution; the other half
is correction of the errors. The detection is feasible, as
proven by this algorithm.
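The four parity checks can also be checked exhaustively outside of
MacPitts. The Python sketch below is an illustration under stated
assumptions, not the thesis test procedure: the function names are
hypothetical, the check bits are assumed to sit at the one-based
positions 1, 2, 4, and 8, and even parity is assumed for a valid
codeword. Every 11 bit data word is encoded, each of the 15 bits is
complemented in turn, and the syndrome is compared with the position
of the error.

```python
# Illustrative sketch only: exhaustive single-error check of the
# 15 bit parity scheme. Assumes check bits at one-based positions
# 1, 2, 4, 8 and even parity in a valid codeword.

CODEWORD_BITS = 15
CHECK_POSITIONS = (1, 2, 4, 8)


def syndrome(bits):
    """Four bit syndrome; bit k covers positions p with bit k of (p + 1) set."""
    value = 0
    for k in range(4):
        parity = 0
        for p in range(CODEWORD_BITS):
            if ((p + 1) >> k) & 1:
                parity ^= bits[p]
        value |= parity << k
    return value


def encode(data_bits):
    """Place 11 data bits in the non-check positions, then set check bits."""
    bits = [0] * CODEWORD_BITS
    data = iter(data_bits)
    for pos in range(1, CODEWORD_BITS + 1):
        if pos not in CHECK_POSITIONS:
            bits[pos - 1] = next(data)
    s = syndrome(bits)  # with zero check bits, bit k flags an odd check
    for k, pos in enumerate(CHECK_POSITIONS):
        bits[pos - 1] = (s >> k) & 1
    return bits


def count_failures():
    """Count codewords or single-bit errors that the syndrome misreports."""
    failures = 0
    for value in range(2 ** 11):
        data = [(value >> i) & 1 for i in range(11)]
        codeword = encode(data)
        if syndrome(codeword) != 0:
            failures += 1
        for pos in range(1, CODEWORD_BITS + 1):
            corrupted = list(codeword)
            corrupted[pos - 1] ^= 1
            if syndrome(corrupted) != pos:
                failures += 1
    return failures


print("failures:", count_failures())
```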

Table 6.1 shows a comparison between the three
error detectors.



;HAM15.MAC
;A MacPitts algorithm to implement an 11 bit message error
;correction chip. The Hamming method is used. 11 of the
;15 bits are data bits, 4 of the 15 are parity check positions.

(program ham11 15
  (def 1 ground)
  (def 2 phia)
  (def 3 phib)
  (def 4 phic)
  (def msg port input (5 6 7 8 9 10 11 12 13 14 15 16 17 18 19))
  (def synd1 signal output 20)
  (def synd2 signal output 21)
  (def synd3 signal output 22)
  (def synd4 signal output 23)
  (def 24 power)
  (always
    ;set lsb of syndrome:
    (cond
      ((parity (bit 0 msg) (bit 2 msg) (bit 4 msg) (bit 6 msg)
               (bit 8 msg) (bit 10 msg) (bit 12 msg) (bit 14 msg))
       (setq synd1 t))
      (t
       (setq synd1 f)))
    ;set next bit of syndrome:
    (cond
      ((parity (bit 1 msg) (bit 2 msg) (bit 5 msg) (bit 6 msg)
               (bit 9 msg) (bit 10 msg) (bit 13 msg) (bit 14 msg))
       (setq synd2 t))
      (t
       (setq synd2 f)))
    ;set next bit of syndrome:
    (cond
      ((parity (bit 3 msg) (bit 4 msg) (bit 5 msg) (bit 6 msg)
               (bit 11 msg) (bit 12 msg) (bit 13 msg) (bit 14 msg))
       (setq synd3 t))
      (t
       (setq synd3 f)))
    ;set msb of syndrome:
    (cond
      ((parity (bit 7 msg) (bit 8 msg) (bit 9 msg) (bit 10 msg)
               (bit 11 msg) (bit 12 msg) (bit 13 msg) (bit 14 msg))
       (setq synd4 t))
      (t
       (setq synd4 f)))))



Figure 6.6 Ham15.mac

Figure 6.7 Ham15.cif

TABLE 6.1

THREE ERROR DETECTORS

                              HAM3       HAM7       HAM15

Chip area [mm**2]             3.473      4.812      11.113
Control path area             2.75       1.918      8.025
Number pullups [in control]   9          31         71
Number pads                   10         15         24
MacPitts pwr. [W]             .03194     .06094     .12265
Powest pwr. (avg) [W]         .02170     .03808     .06191
Powest pwr. (max) [W]         .04341     .07379     .11746
Max. delay [ns]               51.54      296.23     578.42
Max. frequency [MHz]          19.40      3.34       1.73
Cycles/result                 1          1          1
Throughput [results/sec]      19.40M     3.34M      1.73M



So this method of parallel error detection appears
feasible for word lengths less than 16 bits. The speed is
fast due to the chosen single-state MacPitts architecture
(ALWAYS = one PROCESS with zero stack depth, or for this
purpose, a single-state FSM). These chips are unclocked
circuits. The throughput is not a function of the clock
rate, but depends on the signal propagation time from input
to output. The propagation time sets the upper limit on
throughput, and the capacitive leakage from the Weinberger
array gates sets the lower limit on throughput. If the error
detectors are used in a slow system, the outputs must
therefore be latched to maintain valid logic levels. This is
easily done with MacPitts, by SETQing the results to flags,
and subsequently clocking the flags to output signal ports.

B. HAMMING METHOD 15/4 ERROR CORRECTOR

The previous section is only part of the story. Having
located the error bit in the message, it must now be
corrected. The decision of how to implement the error
detector was a simple one, constrained by syntax. The error
detector/corrector invites other methods of implementation.

1. Design Considerations

The message bit error is pointed at by the syndrome
bits (the syndrome indicates the erroneous bit position).
The error bit needs to be complemented, and the correct
message results. The corrected message is then fed to the
output ports. In this application, the extraneous check bits
are discarded. The check bits (c) are used to encode the
original message, and after reception and decoding they
serve no purpose.

The message error detection and correction
procedure can be reduced to three steps:

1. locate the error

2. complement the error bit

3. set the corrected output word bits

The first step is done with the error detection
part of the algorithm. The second step is straightforward in
MacPitts. Either the output bit is the input message bit
(the correct message bit case), or else the output bit is
the complement of the corresponding message bit (the
incorrect message bit case). The checking is done with the
COND form in MacPitts.

The third step involves discarding the check bits,
setting the correct output bits to the corresponding input
bit values, and sending the complement of the erroneous
input bit to the corresponding output bit position.
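The three steps can be sketched in Python for the seven bit codeword
layout c c i c i i i described in Section A. This is an illustration
of the procedure only, not a model of any MacPitts listing; the
function and constant names are hypothetical.

```python
# Illustrative sketch only: locate, complement, and extract for the
# seven bit layout c c i c i i i (positions 0 through 6).

DATA_POSITIONS = (2, 4, 5, 6)   # zero-based positions of the "i" bits


def syndrome(bits):
    """Syndrome value; bit k covers positions p with bit k of (p + 1) set."""
    value = 0
    k = 0
    while (1 << k) <= len(bits):
        parity = 0
        for p, b in enumerate(bits):
            if ((p + 1) >> k) & 1:
                parity ^= b
        value |= parity << k
        k += 1
    return value


def correct_and_extract(received):
    """Return the four corrected data bits of a seven bit codeword."""
    s = syndrome(received)              # step 1: locate the error
    corrected = list(received)
    if s != 0:
        corrected[s - 1] ^= 1           # step 2: complement the error bit
    return [corrected[p] for p in DATA_POSITIONS]  # step 3: drop check bits


data_error = correct_and_extract([0, 1, 0, 0, 0, 1, 1])   # data bit in error
check_error = correct_and_extract([1, 1, 1, 0, 0, 1, 1])  # check bit in error
print(data_error, check_error)
```

An error in a check bit leaves the data bits untouched, so both calls
return the same four data bits.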

2. Prototype Designs

Bit manipulations require Boolean data types, so
flags and signals are used. The flags store the computed
syndrome bits, and the signals are used for input and
output. Figure 6.8 shows the MacPitts driver, ham3c.mac.

There are three COND statements in ham3c.mac. The
first two determine the results of the message parity
checks, as in the error detection algorithms. The last COND
sets the single message bit according to the result of the
parity checks. If fs1 (flag, synd1) is FALSE and fs0 is
TRUE, then the message bit is incorrect. The output is then
set to the complement of the input bit value. If the form
under the last COND is FALSE, then either there is no error
in the message, or one of the two check bits is
incorrect. In either case, the input message data bit is
correct, so the output data bit (out0) is set to the lsb of
the input message (msg0).

The format of the input is three symbols, two of
which are check bits and one data (information) bit.

bit position   0 1 2
bit function   c c i



Only the last bit is returned from the error
correction routine; the two check bits (inserted in the
encoding of the message) are useless at this point. The last
bit is the result of the error correction process, and is
also the output of the prototype design. The algorithm
(ham3c.mac) has the syndrome bits declared as output
signals. This is considered good programming form (MacPitts
being both a language and a silicon compiler), and allows
troubleshooting the algorithm at run time. The syndrome
outputs are unnecessary for the error corrector chip, and
are deleted after verification of the algorithm in the
command interpreter.

The resulting cifplot is Figure 6.9. The BIT
organelles are absent, but two data path organelles
corresponding to the flags fs1 and fs0 are instantiated.
These are the storage elements for the computed syndrome
values. The Weinberger array writes to and reads from these
flags, as the algorithm suggests. An implication of this
hybrid (data path and control path) structure is slower
speed. This does not necessarily denote slower throughput,
but slower signal speed across the logic circuitry.

;HAM3C.MAC
;MacPitts algorithm for single-error detection & correction.
;This algorithm serves as a paradigm for the Hamming single
;error detection and correction problem.

(program ham1 3
  (def 1 ground)
  (def 2 phia)
  (def 3 phib)
  (def 4 phic)
  ;msg(n) : the input datum and 2 parity check bits
  ;out0   : the corrected datum
  ;synd(n): the bit-checked Hamming error syndromes
  ;fs(n)  : integer storage flags for the syndrome states
  (def msg2 signal input 5)
  (def msg1 signal input 6)
  (def msg0 signal input 7)
  (def out0 signal output 8)
  (def synd1 signal output 9)
  (def synd0 signal output 10)
  (def fs0 flag)
  (def fs1 flag)
  (def 11 power)
  (always ;a 1 state FSM
    (cond ;set the lsb of the error-bit syndrome:
      ((parity msg0 msg2)
       (setq synd0 t) (setq fs0 t))
      (t
       (setq synd0 f) (setq fs0 f)))
    (cond ;set the msb of the error-bit syndrome:
      ((parity msg1 msg2)
       (setq synd1 t) (setq fs1 t))
      (t
       (setq synd1 f) (setq fs1 f)))
    (cond ;the fs(n) flag states determine whether
          ;the output datum requires correction.
      ((and (not fs1) fs0)
       (setq out0 (not msg0)))
      (t
       (setq out0 msg0)))
  ))

Figure 6.8 Ham3c.mac

Figure 6.9 Ham3c.cif

To . the right o-f the two -flags is a bank o-f three 
dual cascaded vertical inverters. This structure performs a 
function analogous to what the clock drivers do for data 
path registers (superbuf f ering and sequencing of the three 
phases) - 

Just as the error detector was tested for the three
bit, seven bit, and 15 bit cases, so is the error corrector
tested next for the case of a seven bit message (the error
corrector incorporates the error detector in its logic).

This section suggests a method whereby the designer
can optimize the MacPitts chip. Three solutions to the error
detection/correction problem are considered. Each is
investigated, and the best solution is chosen as the
archetype for the final 15 bit error corrector chip. The
archetype is chosen on a seven bit basis instead of the
simpler three bit chip. The seven bit error
detector/correctors require more time to design and analyze,
but their performance is more representative of the desired
chip's than is that of the three bit detector/corrector.

The first method is an elaboration on ham3c.mac.
The algorithm is shown in Figure 6.10, and the cifplot is
Figure 6.11. This algorithm uses three flags (fs0, fs1, and
fs2) to store the individual syndrome bits. The syndrome
bits are subsequently tested in the Weinberger array, and
used to selectively set the four output bits of the
corrected message (out6, out5, out4, and out2). This
solution has the advantage of clarity, and the disadvantage
of slowness due to the hybrid structure and poly run
lengths. In comparing this algorithm to Figure 6.8
(ham3c.mac), it can be inferred that the number of COND
statements in the error detection part of the algorithm is
always the same as the number of parity checks needed.
Similarly, the number of CONDs in the error correction part
equals the number of output data bits.
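The flag-based algorithm can be checked behaviorally outside MacPitts. The following Python sketch is not part of the thesis tool flow; it mirrors the three parity checks and four corrections of Figure 6.10 under two assumptions: that PARITY is true for an odd number of ones, and that the parity bits msg0, msg1, and msg3 were generated for even parity. The helper names (parity, encode7, correct7, all_single_errors_corrected) are hypothetical.

```python
from itertools import product

def parity(*bits):
    """True when an odd number of inputs are 1 (assumed meaning of PARITY)."""
    return sum(bits) % 2 == 1

def encode7(d2, d4, d5, d6):
    """Hypothetical encoder: pick msg0, msg1, msg3 so that every parity
    check of the corrector sees an even number of ones."""
    msg = [0] * 7
    msg[2], msg[4], msg[5], msg[6] = d2, d4, d5, d6
    msg[0] = d2 ^ d4 ^ d6
    msg[1] = d2 ^ d5 ^ d6
    msg[3] = d4 ^ d5 ^ d6
    return msg

def correct7(msg):
    """Mirror of the seven CONDs: three set the syndrome, four set outputs."""
    fs0 = parity(msg[0], msg[2], msg[4], msg[6])   # lsb of syndrome
    fs1 = parity(msg[1], msg[2], msg[5], msg[6])   # middle bit
    fs2 = parity(msg[3], msg[4], msg[5], msg[6])   # msb of syndrome
    syndrome = (fs2, fs1, fs0)
    # Syndrome pattern (fs2, fs1, fs0) that selects each data bit for inversion.
    patterns = {
        2: (False, True, True),
        4: (True, False, True),
        5: (True, True, False),
        6: (True, True, True),
    }
    # Returns (out2, out4, out5, out6).
    return tuple(msg[b] ^ int(syndrome == patterns[b]) for b in (2, 4, 5, 6))

def all_single_errors_corrected():
    """Exhaustively flip each of the seven message bits for all 16 data words."""
    for data in product((0, 1), repeat=4):
        clean = encode7(*data)
        if correct7(clean) != data:
            return False
        for pos in range(7):
            bad = list(clean)
            bad[pos] ^= 1
            if correct7(bad) != data:
                return False
    return True
```

Calling all_single_errors_corrected() exercises the same cases that would be entered by hand in the command interpreter, which may be useful as a cross-check before compilation.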

This version of the chip requires two clock cycles
to produce an output (write the error syndromes to the
flags, then read the flags to determine the correct output).
The throughput is 316,180 results/sec, which follows from the
1581.37 ns delay and the two cycles per result given in
Table 6.2. A result is taken to be a corrected data word, in
this case a four-bit word.

Figure 6.12 shows an alternate solution,
ham7cs.mac. This algorithm replaces the three flags with
internal signals, is0, is1, and is2. Internal signals in
MacPitts have the advantage of not requiring time-consuming
storage operations. This architecture reduces the error
corrector to a combinational logic structure, implemented in
the control path due to syntax (all Boolean forms). The
algorithm has a similar structure to the previous one which
used flags to store the syndromes (Figure 6.10). There are
three CONDs to set the syndrome, and four CONDs to set the
output word. The question of internal timing arises: will
MacPitts have the syndrome ready in time for the output word
setting? The answer is yes, because the algorithm executes
sequentially in the order written in the absence of
parallelizing forms (COND, PAR, PROCESS).

This algorithm is also faster than the previous
one. The throughput is 2,034,000 words/sec, roughly six
times as fast as the chip using flags to store the syndrome.

Another solution considers the PAR form for
paralleling the CONDs. An increase in speed results if the
three CONDs which set the syndrome are paralleled, and then
the four CONDs which set the output are paralleled with PAR.
The throughput of this chip is 2,208,000 words/sec, slightly
faster than the chip without PARs around the CONDs. This
translates into larger structure (Table 6.2). Figure 6.14 is
the MacPitts driver, ham7cr.mac, and Figure 6.15 is the
cifplot.

This version of the error detector/corrector is the
archetype (chief example) for the 15 bit error
detector/corrector. It was developed based on the three bit
prototype (Figure 6.8), refined, tested with the MacPitts
interpreter and Crystal, and is considered the optimal
MacPitts parallel-architecture solution for the seven bit
correction problem. It serves as the model for building the



;HAM7Cfth.MAC
;Hamming 7 bit message error corrector, FLAGS for syndromes
(program ham7cfth 1
  (def 1 ground)(def 2 phia)(def 3 phib)(def 4 phic)
  (def msg0 signal input 5)(def msg1 signal input 6)
  (def msg2 signal input 7)(def msg3 signal input 8)
  (def msg4 signal input 9)(def msg5 signal input 10)
  (def msg6 signal input 11)
  (def out6 signal output 12)(def out5 signal output 13)
  (def out4 signal output 14)(def out2 signal output 15)
  (def fs2 flag)        ;FLAGS store syndromes' states:
  (def fs1 flag)
  (def fs0 flag)
  (def 16 power)
  (always
    ;set lsb of syndrome:
    (cond
      ((parity msg0 msg2 msg4 msg6)
        (setq fs0 t))
      (t
        (setq fs0 f)))
    ;set middle bit of syndrome:
    (cond ((parity msg1 msg2 msg5 msg6)
        (setq fs1 t))
      (t
        (setq fs1 f)))
    ;set msb of syndrome:
    (cond ((parity msg3 msg4 msg5 msg6)
        (setq fs2 t))
      (t
        (setq fs2 f)))
    ;The erroneous MESSAGE bits are corrected
    ;Check data bit 2 (msg bit 3):
    (cond
      ((and (not fs2) fs1 fs0)
        (setq out2 (not msg2)))
      (t
        (setq out2 msg2)))
    ;Check data bit 4 (msg bit 5):
    (cond
      ((and fs2 (not fs1) fs0)
        (setq out4 (not msg4)))
      (t
        (setq out4 msg4)))
    ;Check data bit 5 (msg bit 6):
    (cond
      ((and fs2 fs1 (not fs0))
        (setq out5 (not msg5)))
      (t
        (setq out5 msg5)))
    ;Check data bit 6 (msg bit 7):
    (cond
      ((and fs2 fs1 fs0)
        (setq out6 (not msg6)))
      (t
        (setq out6 msg6)))))



Figure 6.10 Ham7cf.mac



Figure 6.11 Ham7cf.cif
;HAM7Cs.MAC
;Hamming 7 bit message error corrector, SIGNALS
(program ham7cs 1
  (def 1 ground)(def 2 phia)(def 3 phib)(def 4 phic)
  (def msg0 signal input 5)(def msg1 signal input 6)
  (def msg2 signal input 7)(def msg3 signal input 8)
  (def msg4 signal input 9)(def msg5 signal input 10)
  (def msg6 signal input 11)
  (def out6 signal output 12)(def out5 signal output 13)
  (def out4 signal output 14)(def out2 signal output 15)
  ;3 signals needed to pass the syndrome's bits:
  (def is2 signal internal)
  (def is1 signal internal)
  (def is0 signal internal)
  (def 17 power)
  (always                ;Use SIGNALS instead of FLAGS
    ;set lsb of syndrome:
    (cond
      ((parity msg0 msg2 msg4 msg6)
        (setq is0 t))
      (t
        (setq is0 f)))
    ;set middle bit of syndrome:
    (cond ((parity msg1 msg2 msg5 msg6)
        (setq is1 t))
      (t
        (setq is1 f)))
    ;set msb of syndrome:
    (cond ((parity msg3 msg4 msg5 msg6)
        (setq is2 t))
      (t
        (setq is2 f)))
    ;Check data bit 2 (msg bit 3):
    (cond
      ((and (not is2) is1 is0)
        (setq out2 (not msg2)))
      (t
        (setq out2 msg2)))
    ;Check data bit 4 (msg bit 5):
    (cond
      ((and is2 (not is1) is0)
        (setq out4 (not msg4)))
      (t
        (setq out4 msg4)))
    ;Check data bit 5 (msg bit 6):
    (cond
      ((and is2 is1 (not is0))
        (setq out5 (not msg5)))
      (t
        (setq out5 msg5)))
    ;Check data bit 6 (msg bit 7):
    (cond
      ((and is2 is1 is0)
        (setq out6 (not msg6)))
      (t
        (setq out6 msg6)))))



Figure 6.12 Ham7cs.mac



Figure 6.13 Ham7cs.cif
;HAM7Cr.MAC
;Hamming 7 bit message error corrector, using PAR
(program ham7cr 1
  (def 1 ground)(def 2 phia)(def 3 phib)(def 4 phic)
  (def msg0 signal input 5)(def msg1 signal input 6)
  (def msg2 signal input 7)(def msg3 signal input 8)
  (def msg4 signal input 9)(def msg5 signal input 10)
  (def msg6 signal input 11)
  (def out6 signal output 12)(def out5 signal output 13)
  (def out4 signal output 14)(def out2 signal output 15)
  ;3 signals needed to pass the syndrome's bits:
  (def is2 signal internal)
  (def is1 signal internal)
  (def is0 signal internal)
  (def 17 power)
  (always                ;do every clk cycle
    ;set lsb of syndrome:
    (par
      (cond              ;PARallel parity checking, setting
        ((parity msg0 msg2 msg4 msg6)
          (setq is0 t))
        (t
          (setq is0 f)))
      ;set middle bit of syndrome:
      (cond ((parity msg1 msg2 msg5 msg6)
          (setq is1 t))
        (t
          (setq is1 f)))
      ;set msb of syndrome:
      (cond ((parity msg3 msg4 msg5 msg6)
          (setq is2 t))
        (t
          (setq is2 f))))
    ;Check data bit 2 (msg bit 3):
    (par
      (cond
        ((and (not is2) is1 is0)
          (setq out2 (not msg2)))
        (t
          (setq out2 msg2)))
      ;Check data bit 4 (msg bit 5):
      (cond
        ((and is2 (not is1) is0)
          (setq out4 (not msg4)))
        (t
          (setq out4 msg4)))
      ;Check data bit 5 (msg bit 6):
      (cond
        ((and is2 is1 (not is0))
          (setq out5 (not msg5)))
        (t
          (setq out5 msg5)))
      ;Check data bit 6 (msg bit 7):
      (cond
        ((and is2 is1 is0)
          (setq out6 (not msg6)))
        (t
          (setq out6 msg6))))))



Figure 6.14 Ham7cr.mac 




Figure 6.15 Ham7cr.cif




15 bit machine (the seven bit model is easier to analyze in
the interpreter, and with Crystal and Esim).

It is impractical to carry out the preceding design
process beginning with a 15 bit machine. The 15 bit message
cannot be tested in the interpreter (all the inputs and
outputs will not fit on the VT-100 screen), and Caesar and
Crystal analysis is far more complicated with large
structures. It is better to optimize with a smaller model,
and then extend the results to achieve the desired chip.

Table 6.2 is a parametric comparison of the three
Hamming error detector/corrector chips. The reason for the
choice of ham7cr.mac is clear from previous discussion and
these statistics.

                         TABLE 6.2

                CHIP PARAMETRIC COMPARISON

                           HAM7Cf     HAM7Cs     HAM7Cr

Area [mm**2]               7.003      6.305      6.187
Power [W]                  .102       .0931      .0931
Delay [ns]                 1581.37    491.64     452.94
Speed [MHz]                .6324      2.034      2.208
Cycles/res.                2          1          1
Throughput [res/s]         .316M      2.034M     2.208M
Speed/area [MHz/mm**2]     .0903      .3226      .3579
Density [tran/mm**2]       53.6       45.7       46.6
The reason for the choice of ham7cr as the model is
seen in Table 6.2. The chip (Ham7cr) is smaller and faster
than its predecessors. It has the highest throughput of all
the seven bit correctors. The result of using the PAR form
is seen by comparing the speed/area ratios of ham7cs and
ham7cr. PAR translates into more decisions done
simultaneously, and the decisions are done faster
(speed/area is greater). The result of storing the syndrome
bits in flags (ham7cf) is shown in its comparatively low
throughput and low speed/area figures.
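The speed and throughput rows of Table 6.2 follow directly from the Crystal delay and the number of clock cycles per result. The short Python sketch below reproduces that arithmetic. The function and dictionary names are illustrative only, and the assumption that the maximum clock rate is the reciprocal of the worst-case delay is inferred from the table rather than stated by the tools.

```python
# Worst-case delay [ns] and clock cycles per result, taken from Table 6.2.
CHIPS = {
    "ham7cf": (1581.37, 2),
    "ham7cs": (491.64, 1),
    "ham7cr": (452.94, 1),
}

def speed_hz(delay_ns):
    """Maximum clock rate, assumed to be the reciprocal of the delay."""
    return 1e9 / delay_ns

def throughput(delay_ns, cycles_per_result):
    """Results per second: clock rate divided by cycles needed per result."""
    return speed_hz(delay_ns) / cycles_per_result

RESULTS = {name: throughput(d, c) for name, (d, c) in CHIPS.items()}

# Ratio of the signal-based chip to the flag-based chip.
SPEEDUP_SIGNALS_OVER_FLAGS = RESULTS["ham7cs"] / RESULTS["ham7cf"]
```

Under these assumptions the flag-based chip yields about 316 thousand results/sec, and the two signal-based chips about 2.03 and 2.21 million results/sec, so replacing flags with internal signals gains a factor of a little over six.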

A functional summary of the three prototype
candidate algorithms (flowcharts and resulting floorplans)
is given in Figures 6.16 - 6.21.


4. Hamming 15/4 Error Corrector

The 15 bit error corrector is designed after the
PARalleled COND version of the ham7 algorithm, ham7cr.mac
(Figure 6.14). As explained above, the number of CONDs
expected is the sum of the number of syndrome bits and the
number of corrected data bits out. There are four syndrome
bits for the 15/4 code, and 11 corrected data bits out, for
a total of 15 CONDs in the algorithm. Figure 6.22 shows
ham15dc.mac. The algorithm structure is similar to ham7,
except for the pin naming which has been shortened to make
it easier to enter the data for analysis (Crystal, Caesar
labels, esim). There are four parity checks across the bits
as described in the paragraph on error detection. The parity



Figure 6.16 Ham7cf Flowchart
[flowchart: SET SYNDROME, then SET OUTPUT BITS]



Figure 6.17 Ham7cf Floorplan
[pin labels: GND, phia, phib, phic, MSG0 through MSG6, OUT2, Vdd, OUT6, OUT5, OUT4]



Figure 6.18 Ham7cs Flowchart



Figure 6.19 Ham7cs Floorplan



Figure 6.20 Ham7cr Flowchart



Figure 6.21 Ham7cr Floorplan



checks result in four syndrome internal signals. The
internal signals translate to feedback within the Weinberger
array. After the bit error is identified by the syndrome
pattern, it is corrected. There are 11 CONDs which
accomplish the bit-wise correction of the output word, one
for each bit which is not an encoding bit (positions 0, 1,
3, and 7).
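The parity groups and the list of corrected bits follow a regular pattern, so they can be generated rather than typed by hand when the algorithm is expanded from seven to 15 bits. The Python sketch below is an illustration, not part of MacPitts; it assumes bit index i corresponds to code position i + 1, which is consistent with the encoding bits sitting at indices 0, 1, 3, and 7. The function names are hypothetical.

```python
def parity_groups(n_bits):
    """For each syndrome bit k, list the message bit indices it checks.

    Index i is checked by syndrome bit k when bit k of the position
    number (i + 1) is set. n_bits is assumed to be of the form 2**r - 1.
    """
    r = n_bits.bit_length()
    return [[i for i in range(n_bits) if ((i + 1) >> k) & 1]
            for k in range(r)]

def data_bits(n_bits):
    """Indices that carry data: positions (i + 1) that are not powers of two."""
    return [i for i in range(n_bits) if (i + 1) & i]

def cond_count(n_bits):
    """Expected number of CONDs: one per syndrome bit plus one per data bit."""
    return len(parity_groups(n_bits)) + len(data_bits(n_bits))
```

For n_bits = 15 the first group is indices 0, 2, 4, ..., 14, and the data bits are the 11 indices other than 0, 1, 3, and 7, giving 15 CONDs; for n_bits = 7 the same functions give the seven CONDs of the ham7 algorithms.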

The algorithm compiled to cif, as expected. The
size of the Weinberger array (155 columns) required a long
time for compilation, approximately 3.5 hours (at night) on
the VAX 11/780 at Naval Postgraduate School. The resulting
labelled cifplot is shown in Figure 6.23. The circuit is an
expansion of the seven bit Hamming error correctors, but
larger. The seven bit chip has seven CONDs, the 15 bit chip
has 15. The result of COND in the algorithm is NOR gates in
the Weinberger array. The chip measures 5.1371 mm by 4.005
mm, for an area of 20.57 sq. mm. There are 238 pull up
transistors, so the Powest-calculated power dissipation of
0.1229 W (average) is no surprise (MacPitts estimates the
power consumption as 0.16086 W). The Powest estimated
maximum dc power is 0.2321 W. Crystal timing analysis
predicts a maximum delay of 1222.94 ns, for a maximum data
rate of 818 kHz and therefore a maximum throughput of
818,000 results/sec (8,998,000 bits/sec). The circuit
density is sparse, as seen in the cifplot, and the average
density is approximately 37 transistors/sq. mm. The sparsity
;HAM15dc.MAC
;Hamming 15/4 error detector/corrector
(program ham15dc 1
  (def 1 ground)(def 2 phia)(def 3 phib)(def 4 phic)
  (def m0 signal input 5)(def m1 signal input 6)
  (def m2 signal input 7)(def m3 signal input 8)
  (def m4 signal input 9)(def m5 signal input 10)
  (def m6 signal input 11)(def m7 signal input 12)
  (def m8 signal input 13)(def m9 signal input 14)
  (def m10 signal input 15)(def m11 signal input 16)
  (def m12 signal input 17)(def m13 signal input 18)
  (def m14 signal input 19)
  (def s14 signal output 20)(def s13 signal output 21)
  (def s12 signal output 22)(def s11 signal output 23)
  (def s10 signal output 24)(def s9 signal output 25)
  (def s8 signal output 26)(def s6 signal output 27)
  (def s5 signal output 28)(def s4 signal output 29)
  (def s2 signal output 30)
  (def 31 power)
  (def is0 signal internal)(def is1 signal internal)
  (def is2 signal internal)(def is3 signal internal)
  (always
    (par                 ;PARallel syndrome setting:
      ;set lsb of syndrome:
      (cond
        ((parity m0 m2 m4 m6 m8 m10 m12 m14)(setq is0 t))
        (t (setq is0 f)))
      ;set middle bit of syndrome:
      (cond
        ((parity m1 m2 m5 m6 m9 m10 m13 m14)(setq is1 t))
        (t (setq is1 f)))
      ;set next bit syndrome:
      (cond
        ((parity m3 m4 m5 m6 m11 m12 m13 m14)(setq is2 t))
        (t (setq is2 f)))
      ;set msb syndrome:
      (cond
        ((parity m7 m8 m9 m10 m11 m12 m13 m14)(setq is3 t))
        (t (setq is3 f))))
    ;check & set output data bits:
    (par                 ;PARallel check/set operations:
      ;data bit 2 (m3)
      (cond
        ((and (not is3) (not is2) is1 is0)
          (setq s2 (not m2)))
        (t (setq s2 m2)))
      ;data bit 4 (m5)
      (cond
        ((and (not is3) is2 (not is1) is0)
          (setq s4 (not m4)))
        (t (setq s4 m4)))
      ;data bit 5 (m6)
      (cond
        ((and (not is3) is2 is1 (not is0))
          (setq s5 (not m5)))
        (t (setq s5 m5)))
      ;data bit 6 (m7)
      (cond
        ((and (not is3) is2 is1 is0)
          (setq s6 (not m6)))
        (t (setq s6 m6)))
      ;data bit 8 (m9)
      (cond
        ((and is3 (not is2) (not is1) is0)
          (setq s8 (not m8)))
        (t (setq s8 m8)))
      ;data bit 9 (m10)
      (cond
        ((and is3 (not is2) is1 (not is0))
          (setq s9 (not m9)))
        (t (setq s9 m9)))
      ;data bit 10 (m11)
      (cond
        ((and is3 (not is2) is1 is0)
          (setq s10 (not m10)))
        (t (setq s10 m10)))
      ;data bit 11 (m12)
      (cond
        ((and is3 is2 (not is1) (not is0))
          (setq s11 (not m11)))
        (t (setq s11 m11)))
      ;data bit 12 (m13)
      (cond
        ((and is3 is2 (not is1) is0)
          (setq s12 (not m12)))
        (t (setq s12 m12)))
      ;data bit 13 (m14)
      (cond
        ((and is3 is2 is1 (not is0))
          (setq s13 (not m13)))
        (t (setq s13 m13)))
      ;data bit 14 (m15)
      (cond
        ((and is3 is2 is1 is0)
          (setq s14 (not m14)))
        (t (setq s14 m14))))))



Figure 6.22 Ham15dc.mac



Figure 6.23 Ham15dc.cif








is due in part to the absence of a data path. If just the
Weinberger array is considered, however, the circuit density
is approximately 100 transistors/sq. mm. Appendix D contains
the script recording of the compilation of ham15dc.mac.

The transistor densities given in Table 6.2 are
derived from MacPitts chips. A comparison with standard
library cell densities derived from Newkirk and Mathews
[Ref. 12] may be illuminating.

                         TABLE 6.3

              TRANSISTOR DENSITY COMPARISON

CIRCUIT                              DENSITY [tran./mm**2]

Ham7Cf                                        54
Ham7Cs                                        46
Ham7Cr                                        47
CountUDRestore [Ref. 11:p. 7?]               457
COUNT [Ref. 11:p. 67]                        753
ALU [Ref. 11:p. 20]                          616
ADDER [Ref. 11:p. 10]                        691



So the MacPitts chips are far less dense than even
the library macro cells. The Newkirk-Mathews cells only
consider the cell itself, and not the chip, which was the
basis on which the MacPitts densities were calculated.
Nevertheless, a density factor of 10 is a considerable
difference (the MacPitts chips in this chapter are
approximately 50% circuitry, and 50% white space, so a
density factor of five is still significant).
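The density factor quoted above can be checked against the entries of Table 6.3. The Python sketch below computes the smallest, largest, and mean ratio of library cell density to MacPitts chip density. The 50% white space allowance is the estimate given in the text, and the variable names are illustrative.

```python
# Densities in transistors per sq. mm, from Table 6.3.
MACPITTS = {"Ham7Cf": 54, "Ham7Cs": 46, "Ham7Cr": 47}
LIBRARY = {"CountUDRestore": 457, "COUNT": 753, "ALU": 616, "ADDER": 691}

# Least and most favorable pairings of a library cell with a MacPitts chip.
MIN_RATIO = min(LIBRARY.values()) / max(MACPITTS.values())
MAX_RATIO = max(LIBRARY.values()) / min(MACPITTS.values())

# Ratio of the average library density to the average MacPitts density.
MEAN_RATIO = (sum(LIBRARY.values()) / len(LIBRARY)) / (
    sum(MACPITTS.values()) / len(MACPITTS))

# Allowing for chips that are about half white space halves the ratio.
MEAN_RATIO_CIRCUITRY_ONLY = MEAN_RATIO * 0.5
```

With these figures the ratio ranges from about 8.5 to about 16.4, with a mean near 13, or roughly 6.4 after the white space allowance, which is in line with the factors of ten and five cited in the text.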



VII. CONCLUSION 



A. SUMMARY 

This thesis has considered the effects of syntax on
circuit structure in the MacPitts silicon compiler. The
combinational logic structure is explicitly specified by
syntax in the data path, and the appropriate behavior
results. The circuit behavior is explicitly specified in the
control path, and the combinational logic structure (a
Weinberger array) results.

Combinational logic structures in the data path comprise
adjoined MacPitts macros (organelles). Combinational logic
structure in the control path, however, is always done in a
Weinberger array. The poly runs internal and external to the
Weinberger array make combinational logic operate slower
there than in the equivalent circuit in the data path.
Parallelism of logical functions is possible in MacPitts by
using the COND and PAR forms. These paralleling forms
usually equate to a speed/area tradeoff on the chip.

Sequential logic in MacPitts is implemented as a Mealy-
type FSM. The state registers store the present state, and
receive present input information from both the control path
and the sequencer tail organelle. The data path width, as
declared in the PROGRAM statement, determines the number of
states possible for the FSM. This must be determined by the
designer a priori, and explicitly stated in the PROGRAM
statement. The long poly runs between the data path and
control path cause a slow speed in the MacPitts FSM, as
compared to the handcrafted equivalent. The 8:1 ratioed
superbuffered input pads add to this slowness, because of
the number of NOR gates one pad may have to drive in the
Weinberger array.

The FSM architecture and its attendant Mealy sequencer
organelles are implicitly specified by the PROCESS
statement. Each process is an independent entity in
MacPitts, with its own organelles and wires. Processes do
not communicate internally with each other. The PROCESS form
is another method of parallelism possible in MacPitts. All
PROCESSes embraced by PROGRAM execute in parallel, at the
speed of the slowest-executing process. This capability
makes MacPitts well-suited for design of controller-oriented
chips.

The chip design process with MacPitts can be understood
initially as algorithmic optimization. The test algorithm is
written, tested in the interpreter, and compiled to cif.
Then an expanded version of the test algorithm is written
and tested in the interpreter. The expanded version is
compiled to cif, a circuit extraction is made, and the
electrical characteristics and speed of the chip are
determined. Alternate solutions are then considered, and
tested in the same fashion. The best of these is chosen as
the archetype for the desired chip. The archetype must have
sufficiently few signals, ports, registers, and flags to
permit testing in the interpreter (a maximum of 36). The
algorithm is then expanded again to cover the desired chip
function. The final algorithm is compiled to cif, a circuit
extraction is made, and then the chip is tested
electrically. If there are too many variables to permit
command interpreter display, the algorithm is tested with a
switch-level simulator (this exercises both the algorithm
and the circuit). Further analyses with a power estimator
and a timing analyzer are done to see that the chip operates
within specifications. If the chip operates too slowly,
parallelism should be applied to the algorithm where
possible, in an attempt to trade silicon area for speed.



B. RECOMMENDATIONS 

This thesis also investigated a number of MacPitts
errors and shortcomings. The following recommendations
should be considered:

1. Have the light controller chips fabricated by
   MOSIS for testing at Naval Postgraduate School, and
   compare with the results from Crystal.

2. The Weinberger array errors as depicted in
   Chapter II are thought to result from incorrect
   installation of MacPitts under Unix 4.2. It would
   be fruitful to search for a Unix-dependent roundoff
   error in the instantiation of partial-gate-input-
   ground-right and partial-gate-input-ground-left.
   The poly interconnections between data and control
   also suffer a lateral displacement/gap error, and
   the solution to the partial gate problem is likely
   to solve this one also. Similar errors were also
   noted in the data path, usually between vertical
   metal lines and horizontal Vdd/GND busses.



3. New Mead-Conway organelles (cf. Chapter III) should
   be tried as replacements for the MacPitts data path
   organelles. This will require comparison between
   similar structures with Powest and Crystal, and
   selection of the better circuit. MacPitts will
   connect the new organelles properly if the pitch is
   preserved.

4. The error of shorted flag traces occurs almost
   every time a flag is declared. The vertical flag
   lines intersect the horizontal clock traces at a
   via cut, which shorts the flag signal and does not
   permit it to pass to control. This error is best
   solved by a conditional test in the routing
   algorithm. If the flag traces run close to
   the Vdd/ground comb, then the traces must be moved
   in towards the center of the chip.



5. The possibility of replacing the slow Weinberger
   array with a PLA should be considered. This
   solution will entail a complete rewrite of the
   control.lisp source file, and major modification to
   other files which depend on or interact with
   control.lisp. A study of plague and plagen (or
   eqntott and tpla) is the best place to start, with
   a view towards replacing the Weinberger array with
   a compact PLA. The difficulty will lie in the
   interface between the PLA logic equation
   specification (in plague or eqntott) and the
   MacPitts algorithmic language.

6. The problem of vestigial instantiation (sequencers,
   unconnected vertical poly runs from the data path)
   could be solved with a simple test using list
   processing primitives. If the organelles or wires
   are not needed, then skip the instantiation
   process.

7. The problem of the unconnected Vdd bus only occurs
   in very small chips, but should be simple to
   correct. A metal routing up and to the left, to
   connect to the Vdd comb, is required. The simple
   solution is to explicitly specify a connecting wire
   in the CLL-like language used in the MacPitts
   source code. The more instructive solution is to
   write the Franz LISP code to decide if a jumper
   wire is needed, and if so, to create one.

8. A menu invoking Crystal, Esim, Powest, and Mextra
   would speed up the design cycle. The menu could be
   incorporated in MacPitts, but would probably be
   just as good external to MacPitts. A timing
   analysis is necessary in the compilation of the
   chip, however. If it had existed during the Hamming
   15/4 error corrector example (Chapter VI), the
   choice of an archetype chip would have been
   simpler.

9. The VT-100 terminal screen is too small to display
   the interpreter session of all the signals, flags,
   registers, and ports which occur on even a
   moderate-sized MacPitts chip. A windowing
   capability is needed. The source file
   interpret.lisp contains the command interpreter
   logic. The interpreter is functionally a dynamic
   debugger, similar to those in CP/M or VMS (but
   without the ability to change the source code). The
   interpreter has a very slow response time to
   terminal inputs for all but the simplest chip
   algorithms, and it would be useful to speed it up
   also if other modifications are planned.

10. SPICE would be a valuable addition to timing
    analysis. Currently, SPICE 2g6 is not installed on
    the VAX-11/780 at Naval Postgraduate School. A plot
    of the SPICE output is also desired, but not
    available under the currently installed version of
    Unix 4.2.



11. The capability to scale the MacPitts designs to
    sizes other than multiples of 200 or 250
    centimicrons is needed for future applications. The
    ability to scale in multiples of 25 centimicrons is
    suggested, where the designer chooses the option at
    compile time in the MacPitts <options> field.

12. MacPitts currently places pads on only three sides
    of the chip frame. A better design would permit
    pads to be placed on all four sides of the chip.
    This would also allow faster chips, due to
    shortened inter-chip wires.

13. The capability of automatic test vector generation
    and evaluation is lacking. The command interpreter
    should be able to access an existing file for
    testing and write the results of the tests to
    another file.

14. The ability to display transistor density as one of
    the compiler statistics should be incorporated.
    This would be a simple task, since MacPitts already
    computes the chip dimensions and the number of
    transistors, and writes each of these values to the
    statistics output file.

15. A serial implementation of the Hamming 15/4 error
    detector/corrector should be attempted using
    primitive polynomials [Ref. 13], [Ref. 5:pp. 200].
    The throughput should be compared to the parallel
    15/4 error corrector. The interesting problem is to
    solve the differing bandwidths at the input and
    output of the shift register. MacPitts may not be
    able to cope with this requirement, and will likely
    be slower than the parallel architecture (in the
    throughput sense) regardless.



16. A MacPitts prototype FIR or IIR digital filter
    should be attempted. The first model should be an
    FIR four-bit prototype, and this algorithm can then
    be expanded to the floating point version of larger
    word length. An excellent reference for the
    designer is [Ref. 14:pp. 541], where the
    algorithmic aspects of digital filter design are
    explained.

17. Faster graphics are required for the VLSI graphics
    terminal (Caesar). A better (i.e., quicker)
    terminal should be considered.

18. The Backus-Naur file (BNF) included with the
    MacPitts source code specifies allowed algorithmic
    syntax. The macro and lambda forms should be
    investigated with a view to incorporating macros
    into the algorithms.

19. It would speed up the design time and confer added
    versatility on MacPitts if the input port width
    could be specified as a variable. The word lengths
    would then be assigned according to another single
    statement in the MacPitts algorithm. For instance,

    (def face port input (*))
    (def data word width 16)

    would assign a 16-bit width to the variable <face>
    and to any other occurrences of the asterisk.
APPENDIX A



CHAPTER III LISTINGS 



(((destination z)
  (source a)
  (source b)
  (source c)
  (source d)
  (source e)
  (logo fivand)
  (word-length 1)
  (ground 1)
  (port input a (2))
  (port input b (3))
  (port input c (4))
  (port input d (5))
  (port input e (6))
  (port output z (7))
  (phia 8)
  (phib 9)
  (phic 10)
  (power 11))
 nil
 ((organelle and -1 (((port-input d) (port-input e))))
  (organelle and -2 (((port-input c) (internal 1))))
  (organelle and -3 (((port-input b) (internal 2))))
  (organelle and -4 (((port-input a) (internal 3))))
  (port-output 2 (((internal 4)))))
 nil
 ((10 (phic))
  (9 (phib))
  (8 (phia))
  (1 (ground))
  (11 (power))
  (2 (input (a 0) (port-input a 0)))
  (3 (input (b 0) (port-input b 0)))
  (4 (input (c 0) (port-input c 0)))
  (5 (input (d 0) (port-input d 0)))
  (6 (input (e 0) (port-input e 0)))
  (7 (outputs (z 0) (port-output z 0)))))



Data Path Five Input AND Gate .obj File 



Statistic - for project fivand
Statistic - options: (herald opt-d opt-c stat obj cif nologo)
Herald - 68, 58 - Reading source file - fivand.mac
Herald - 72, 58 - Reading library from - /vlsi/macpit/library
Herald - 901, 611 - Processing definitions
Herald - 903, 611 - Evaluating evals
Herald - 986, 611 - Expanding macros
Herald - 989, 611 - Extracting sources
Herald - 990, 611 - Extracting destinations
Herald - 991, 611 - Extracting labels
Herald - 991, 611 - Extracting sequencers
Herald - 991, 611 - Extracting flags, data-path, control, and pins
Statistic - Maximum control depth is 0
Statistic - Number of gates is 0
Statistic - Data-path has 5 units
Herald - 1383, 901 - Outputing .obj file
Herald - 1413, 901 - Extruding gates
Statistic - Control has 0 columns
Herald - 1516, 997 - Extruding straps
Statistic - Circuit has 98 transistors
Statistic - Control has 0 tracks
Statistic - Power consumption is 0.038114 Watts
Herald - 1679, 1095 - Laying out data-path
Herald - 1815, 1192 - Organelle unit# 1 bit 0
Herald - 2014, 1290 - Organelle unit# 2 bit 0
Herald - 2168, 1391 - Organelle unit# 3 bit 0
Herald - 2332, 1498 - Organelle unit# 4 bit 0
Herald - 2385, 1498 - Organelle unit# 5 bit 0
Statistic - Data-path internal bus uses 6 tracks
Herald - 2539, 1600 - Laying out control
Herald - 2542, 1600 - Laying out flags
Herald - 2543, 1600 - Laying out river
Herald - 2545, 1600 - Laying out wing
Herald - 2547, 1600 - Laying out skeleton
Herald - 2683, 1699 - Laying out pins
Statistic - Dimensions are 1.805000 mm by 1.872500 mm
Herald - 5299, 3105 - Outputing .cif file
Statistic - Memory used - 357K
Statistic - Compilation took 1.534722 CPU minutes
Statistic - Garbage collection took 0.893333 CPU minutes
Statistic - For a total of 33 garbage collections



Script of Compilation of Data Path Five Input AND Gate 
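
The Statistic lines of a compilation script carry the figures (transistor count, power, chip dimensions, CPU time) that are compared between designs. As a minimal sketch, assuming each log entry sits on one line in the "Statistic - ..." form, those figures could be collected as follows. The helper name `parse_statistics` and the pattern table are hypothetical and are not part of MacPitts.

```python
import re

# Patterns for a few "Statistic" lines of a MacPitts compilation script.
# Hypothetical helper for illustration only; not part of MacPitts.
PATTERNS = {
    "transistors": re.compile(r"Circuit has (\d+) transistors"),
    "power_watts": re.compile(r"Power consumption is ([\d.]+) Watts"),
    "dimensions_mm": re.compile(r"Dimensions are ([\d.]+) mm by ([\d.]+) mm"),
    "cpu_minutes": re.compile(r"Compilation took ([\d.]+) CPU minutes"),
}


def parse_statistics(log_text):
    """Return a dict of the numeric values found on Statistic lines."""
    stats = {}
    for line in log_text.splitlines():
        if not line.startswith("Statistic"):
            continue  # Herald lines only report progress and timers
        for key, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match:
                values = tuple(float(v) for v in match.groups())
                stats[key] = values[0] if len(values) == 1 else values
    return stats


SAMPLE = """\
Herald - 1516, 997 - Extruding straps
Statistic - Circuit has 98 transistors
Statistic - Power consumption is 0.038114 Watts
Statistic - Dimensions are 1.805000 mm by 1.872500 mm
Statistic - Compilation took 1.534722 CPU minutes
"""

if __name__ == "__main__":
    print(parse_statistics(SAMPLE))
```

Running the same function over the scripts of two functionally equivalent designs gives directly comparable dictionaries.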



94 41 64200 79400;
94 42 82200 79400;
94 43 100200 79400;
94 a 46300 79600;
94 41 64300 79600;
94 42 82300 79600;
94 43 100300 79600;
94 54 48000 79900;
94 41 54200 79900;
94 56 66000 79900;
94 42 72200 79900;
94 56 84000 79900;
94 43 90200 79900;
94 57 102000 79900;
94 z 108200 79900;
94 54 49800 80400;
94 41 55500 80400;
94 55 67800 80400;
94 42 73500 80400;
94 56 85800 80400;
94 43 91500 80400;
94 57 103800 80400;
94 z 109500 80400;
94 a 46300 80400;
94 41 64300 80400;
94 42 82300 80400;
94 43 100300 80400;
94 Vdd 52000 80600;
94 Vdd 57700 80600;
94 Vdd 70000 80600;
94 Vdd 75700 80600;
94 Vdd 88000 80600;
94 Vdd 93700 80600;
94 Vdd 106000 80600;
94 Vdd 111700 80600;
94 54 49800 81600;
94 41 55500 81600;
94 55 67800 81600;
94 42 73500 81600;
94 56 85800 81600;
94 43 91500 81600;
94 57 103800 81600;
94 z 109500 81600;
94 54 49800 82400;
94 41 55500 82400;
94 55 67800 82400;
94 42 73500 82400;
94 56 85800 82400;
94 43 91500 82400;
94 57 103800 82400;
94 z 109500 82400;
94 z 116500 83600;
94 e 97200 84900;
94 z 109400 84900;
94 z 113500 84900;
94 d 79200 86100;
94 43 91400 86100;
94 43 95500 86100;



94 GND 41500 71700;
94 Vdd 52000 76800;
94 Vdd 67700 76800;
94 Vdd 70000 76800;
94 Vdd 75700 76800;
94 Vdd 88000 76800;
94 Vdd 93700 76800;
94 Vdd 106000 76800;
94 Vdd 111700 76800;
94 b 43200 76900;
94 c 61200 76900;
94 d 79200 76900;
94 e 97200 76900;
94 z 113500 76900;
94 b 43200 76900;
94 GND 48000 76900;
94 c 61200 76900;
94 GND 66000 76900;
94 d 79200 76900;
94 GND 84000 76900;
94 e 97200 76900;
94 GND 102000 76900;
94 z 113500 76900;
94 b 46300 77100;
94 c 64300 77100;
94 d 82300 77100;
94 e 100300 77100;
94 b 45000 77100;
94 c 63000 77100;
94 d 81000 77100;
94 e 99000 77100;
94 b 46300 77800;
94 c 64300 77800;
94 d 82300 77800;
94 e 100300 77800;
94 GND 53700 78100;
94 GND 71700 78100;
94 GND 89700 78100;
94 GND 107700 78100;
94 a 41500 78600;
94 41 59500 78600;
94 42 77500 78600;
94 43 95500 78600;
94 a 41500 78600;
94 45 48000 78600;
94 41 59500 78600;
94 47 66000 78600;
94 42 77500 78600;
94 49 84000 78600;
94 43 95500 78600;
94 51 102000 78600;
94 z 116500 78900;
94 z 116500 78900;
94 54 53200 79300;
94 55 71200 79300;
94 56 89200 79300;
94 57 107200 79300;
94 a 46200 79400;



94 c 61200 87400;
94 42 73400 87400;
94 42 77500 87400;
94 b 43200 88600;
94 41 55400 88600;
94 41 59500 88600;
94 a 41500 89900;
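
The records in this listing are CIF user-extension commands of the form `94 name x y;`, each attaching a node name (an input such as a, an internal node such as 41, or a supply rail) to a point in the layout. The following is a minimal sketch of how such records could be grouped by name, assuming one label per `94` record and integer coordinates in CIF units (hundredths of a micron when no scaling is applied). The function name `parse_labels` is hypothetical.

```python
from collections import defaultdict


def parse_labels(cif_text):
    """Collect the (x, y) positions of every CIF '94' label, keyed by name."""
    labels = defaultdict(list)
    # CIF commands are terminated by semicolons, not by line breaks.
    for record in cif_text.split(";"):
        fields = record.split()
        if len(fields) >= 4 and fields[0] == "94":
            name = fields[1]
            x, y = int(fields[2]), int(fields[3])
            labels[name].append((x, y))
    return dict(labels)


SAMPLE = """
94 a 46300 79600;
94 41 64300 79600;
94 z 108200 79900;
94 a 46300 80400;
"""

labels = parse_labels(SAMPLE)

if __name__ == "__main__":
    for name, points in sorted(labels.items()):
        print(name, points)
```

Grouping by name makes it easy to see which nodes are labeled more than once and where each signal enters the data path.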
Crystal, v.2
: build 5andcr.sim
[0:00.1u 0:00.2s 21k]
: inputs a b c d e
[0:00.0u 0:00.0s 30k]
: outputs z
[0:00.0u 0:00.0s 30k]
: delay a -1 0
Marking transistor flow...
Setting Vdd to 1...
Setting GND to 0...
(9 stages examined.)
[0:00.1u 0:00.1s 31k]
: delay b -1 0
(1 stages examined.)
[0:00.0u 0:00.0s 31k]
: delay c -1 0
(1 stages examined.)
[0:00.0u 0:00.0s 31k]
: delay d -1 0
(1 stages examined.)
[0:00.0u 0:00.0s 31k]
: delay e -1 0
(1 stages examined.)
[0:00.0u 0:00.0s 31k]
: critical
Node z is driven low at 86.01ns
    ...through fet at (541, 397) to GND after
57 is driven high at 70.55ns
    ...through fet at (519, 405) to Vdd after
43* is driven low at 61.39ns
    ...through fet at (451, 397) to GND after
56* is driven high at 50.40ns
    ...through fet at (429, 405) to Vdd after
42* is driven low at 41.22ns
    ...through fet at (361, 397) to GND after
55* is driven high at 29.99ns
    ...through fet at (339, 405) to Vdd after
41* is driven low at 20.81ns
    ...through fet at (271, 397) to GND after
54* is driven high at 9.40ns
    ...through fet at (249, 405) to Vdd after
a is driven low at 0.00ns
[0:00.1u 0:00.1s 31k]
: critical -g 5andcr.dum
[0:00.1u 0:00.1s 36k]
: quit
[0:00.4u 0:00.4s 36k] Crystal done.
%



Data Path Five Input AND Crystal Session 
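
The critical-path report lists the arrival time at each node between input a and the output, so the delay of each individual stage is the difference between successive arrival times. The short sketch below works that arithmetic from the times printed in the session; the variable names are illustrative only.

```python
# Arrival times in ns, ordered from input a to the output node,
# as printed by the "critical" command in the session above.
arrivals = [0.00, 9.40, 20.81, 29.99, 41.22, 50.40, 61.39, 70.55, 86.01]

# Delay of each stage = arrival at its output minus arrival at its input.
stage_delays = [round(later - earlier, 2)
                for earlier, later in zip(arrivals, arrivals[1:])]

total_delay = round(sum(stage_delays), 2)
slowest = max(stage_delays)

if __name__ == "__main__":
    for index, delay in enumerate(stage_delays, start=1):
        print(f"stage {index}: {delay:.2f} ns")
    print(f"total: {total_delay:.2f} ns, slowest stage: {slowest:.2f} ns")
```

The eight stage delays sum to the reported 86.01 ns, and the final stage driving the output is the slowest of the eight.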






V f s ♦ 1 

push 541 397 2 2
paint e
label [8:86.0ns,fall
push 519 405 2 2
paint e
label [7:70.6ns,rise
push 451 397 2 2
paint e
label [6:61.4ns,fall
push 429 405 2 2
paint e
label [5:50.4ns,rise
push 361 397 2 2
paint e
label [4:41.2ns,fall
push 339 405 2 2
paint e
label [3:30.0ns,rise
push 271 397 2 2
paint e
label [2:20.8ns,fall
push 249 405 2 2
paint e
label [1:9.4ns,rise
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a> 



% powest -p < 5andcr.sim
gamma=0.4 V**.5, tox=9e-08m, u0=0.08m**2/V-s
vdd=5V, vtd=-3.5V, vte=0.8V, vsb=2V

#devs   Pdc_avg (W)   Pdc_max (W)   type
        0.000000      0.000000      enhancement pullups
        0.000940      0.001879      depletion pullups
        0.000000      0.000000      special depletion pullups
8       0.000940      0.001879      TOTAL
%



Data Path Five Input AND Powest Analysis 



(((destination z)
  (source a)
  (source b)
  (source c)
  (source d)
  (source e)
  (logo fiveand)
  (word-length 1)
  (ground 1)
  (signal a input 5)
  (signal b input 6)
  (signal c input 7)
  (signal d input 8)
  (signal e input 9)
  (signal z output 10)
  (phia 2)
  (phib 3)
  (phic 4)
  (power 11))
 nil
 nil
 (((signal-output z) (nor ((primitive (gate 10)))))
  ((gate 10)
   (nor ((primitive (gate 9))
         (primitive (gate 8))
         (primitive (gate 7))
         (primitive (gate 6))
         (primitive (gate 5)))))
  ((gate 9)
   (nor ((primitive (gate 4))
         (primitive (gate 3))
         (primitive (gate 2))
         (primitive (gate 1))
         (primitive (gate 0))
         (primitive (signal-input a))
         (primitive (signal-input b))
         (primitive (signal-input c))
         (primitive (signal-input d)))))
  ((gate 8)
   (nor ((primitive (gate 4))
         (primitive (gate 3))
         (primitive (gate 2))
         (primitive (gate 1))
         (primitive (gate 0))
         (primitive (signal-input a))
         (primitive (signal-input b))
         (primitive (signal-input c)))))
  ((gate 7)
   (nor ((primitive (gate 4))
         (primitive (gate 3))
         (primitive (gate 2))
         (primitive (gate 1))
         (primitive (gate 0))
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{ pr 1 m 1 1 1 ve 
( pr t m 1 1 1 ve 
( ( gate 6 ) 

( nor 

( ( p r ! m 1 1 t ve 
(primitive 
{ pr I m 1 1 1 ve 
(primitive 
(primitive 
(primitive 
( ( gate 5 ) 

( nor 

((primitive 
( pr i m i t I ve 
( pr i m i t i ve 
(primitive 
(primitive 
( ( gate 4 ) ( nor 



( s i gna 1 - i nput a ) ) 
(signal-input b)))>) 



(gate 
( gate 
(gate 
(gate 
(gate 



4 ) ) 
3 > ) 
2 > ) 
1 > ) 
0 ) ) 



( (gate 
( (gate 
( (gate 
( (gate 
{ ( 4 
(3 
(2 
( 1 
( 1 I 
(9 
( 8 
(7 
( 6 
(5 
( 10 



3) 

2 ) 

1 ) 

0 ) 

( p h i c > ) 
(phib) ) 
(ph la ) ) 

( ground ) ) 
( power ) ) 
(input e 
(input d 
(input c 
(input b 
( input 



( nor 
( nor 
( nor 
(nor 



( s 
(s 
(s 
( s 

( 3 



( s Igna 1 - I nput a > ) ) ) ) 



(gate 4 ) ) 
(gate 3 ) ) 

( gate 2 ) ) 
(gate I ) ) 
(gate 0) ) ) ) ) 
( ( pr I m 1 1 1 ve 
((primitive 
((primitive 
( ( pr I m 1 1 i V e 
( { pr i m i t i ve 



( 5 I gna 1 - I nput 
(signal-input 
( s I g na 1 - i np u t 
( 3 I gna 1 - i nput 
( s i gna 1 - i nput 



i gna 1 - I nput 
i gna 1 - i nput 
i gna 1 - I nput 
i gna 1 - I nput 
I g na 1 - i npu t 



e) ) ) 
d > ) ) 
c ) > ) 
b ) ) ) 
a ) ) ) 



(outputs 2 ( s i gna 1 -output 2 ))))> 



a ) ) 


) ) ) 


b ) ) 


) ) ) 


c ) > 


> ) ) 


d > > 


) ) > 


e ) ) 


> ) ) ) 
Script started on Mon Apr 15 22:29:07 1985
% macpitts fiveand herald
Statistic - for project fiveand
Statistic - options: (herald opt-d opt-c stat obj cif nologo)
Herald - 63, 55 - Reading source file - fiveand.mac
Herald - 70, 55 - Reading library from - /vlsi/macpit/library
Herald - 896, 604 - Processing definitions
Herald - 898, 604 - Evaluating evals
Herald - 983, 604 - Expanding macros
Herald - 1103, 701 - Extracting sources
Herald - 1108, 701 - Extracting destinations
Herald - 1110, 701 - Extracting labels
Herald - 1110, 701 - Extracting sequencers
Herald - 1111, 701 - Extracting flags, data-path, control, and pins
Statistic - Maximum control depth is 4
Statistic - Number of gates is 12
Statistic - Data-path has 0 Units
Herald - 1946, 1236 - Outputing .obj file
Herald - 2002, 1286 - Extruding gates
Statistic - Control has 17 columns
Herald - 4001, 2417 - Extruding straps
Statistic - Circuit has 136 transistors
Statistic - Control has 11 tracks
Statistic - Power consumption is 0.040723 Watts
Herald - 4183, 2517 - Laying out data-path
Statistic - Data-path internal bus uses 0 tracks
Herald - 4186, 2517 - Laying out control
Herald - 4997, 2943 - Laying out flags
Herald - 4599, 2943 - Laying out river
Herald - 5000, 2943 - Laying out wing
Herald - 5018, 2943 - Laying out skeleton
Herald - 5054, 2943 - Laying out pins
Statistic - Dimensions are 1.772500 mm by 1.905000 mm
Herald - 7361, 4042 - Outputing .cif file
Statistic - Memory used - 349K
Statistic - Compilation took 2.106111 CPU minutes
Statistic - Garbage collection took 1.153889 CPU minutes
Statistic - For a total of 41 garbage collections
% ^D
script done on Mon Apr 15 22:34:42 1985
94 Vdd 41000 46000;
94 Vdd 45200 47700;
94 Vdd 48200 47700;
94 Vdd 50400 47700;
94 Vdd 53400 47700;
94 Vdd 55700 47700;
94 Vdd 58700 47700;
94 Vdd 62700 47700;
94 Vdd 65700 47700;
94 Vdd 67900 47700;
94 Vdd 73200 47700;
94 Vdd 78400 47700;
94 Vdd 81400 47700;
94 14 45000 48900;
94 15 48000 48900;
94 16 50200 48900;
94 z 53200 48900;
94 18 55500 48900;
94 19 58500 48900;
94 20 62500 48900;
94 21 65500 48900;
94 22 67700 48900;
94 23 73000 48900;
94 24 78200 48900;
94 25 81200 48900;
94 14 45500 51200;
94 15 48500 51200;
94 16 50700 51200;
94 z 53700 51200;
94 18 56000 51200;
94 19 59000 51200;
94 20 63000 51200;
94 21 66000 51200;
94 22 68200 51200;
94 23 73500 51200;
94 24 78700 51200;
94 25 81700 51200;
94 14 45000 52900;
94 15 48500 52900;
94 16 50200 52900;
94 z 53700 52900;
94 18 55500 52900;
94 19 59000 52900;
94 20 62500 52900;
94 21 66000 52900;
94 22 67700 52900;
94 23 73000 52900;
94 24 78200 52900;
94 25 81700 52900;
94 d 71200 53900;
94 22 67700 53900;
94 d 71200 53900;
94 GND 48500 54700;
94 GND 78700 54700;
94 GND 81700 54700;
94 GND 46700 54900;
94 GND 52000 54900;
94 GND 57200 54900;
94 GND 64200 54900;
94 GND 69500 54900;
94 GND 80000 54900;
94 GND 46700 54900;
94 GND 52000 54900;
94 GND 57200 54900;
94 GND 64200 54900;
94 GND 69500 54900;
94 GND 80000 54900;
94 15 48500 55900;
94 25 81700 55900;
94 z 53700 56700;
94 18 56000 56700;
94 19 59000 56700;
94 20 63000 56700;
94 22 68200 56700;
94 24 78700 56700;
94 16 50200 57900;
94 GND 56000 58700;
94 GND 59000 58700;
94 GND 63000 58700;
94 GND 68200 58700;
94 GND 78700 58700;
94 GND 52000 58900;
94 GND 57200 58900;
94 GND 64200 58900;
94 GND 69500 58900;
94 GND 80000 58900;
94 a 41500 59900;
94 a 41500 59900;
94 23 73000 59900;
94 16 50700 60700;
94 18 56000 60700;
94 19 59000 60700;
94 20 63000 60700;
94 22 68200 60700;
94 24 78700 60700;
94 14 45000 61900;
94 GND 56000 62700;
94 GND 59000 62700;
94 GND 63000 62700;
94 GND 68200 62700;
94 GND 78700 62700;
94 GND 46700 62900;
94 GND 57200 62900;
94 GND 64200 62900;
94 GND 69500 62900;
94 GND 80000 62900;
94 b 43200 63900;
94 b 43200 63900;
94 14 45500 64700;
94 18 56000 64700;
94 20 63000 64700;
94 22 68200 64700;
94 24 78700 64700;
94 19 59000 65000;
94 21 66000 65900;
94 GND 56000 66700;
94 GND 59000 66700;
94 GND 63000 66700;
94 GND 68200 66700;
94 GND 78700 66700;
94 GND 74700 66900;
94 GND 46700 66900;
94 GND 57200 66900;
94 GND 64200 66900;
94 GND 69500 66900;
94 GND 74700 66900;
94 GND 80000 66900;
94 e 76500 67900;
94 19 59000 67900;
94 e 76500 67900;
94 15 48500 68700;
94 20 63000 68700;
94 22 68200 68700;
94 23 73500 68700;
94 24 78700 68700;
94 21 66000 69000;
94 c 60700 69900;
94 18 55500 69900;
94 c 60700 69900;
94 GND 48500 70700;
94 GND 63000 70700;
94 GND 66000 70700;
94 GND 78700 70700;
94 GND 46700 70900;
94 GND 64200 70300;
94 GND 80000 70900;
94 24 78200 71900;
94 15 48500 72700;
94 20 62500 73900;
94 GND 48500 74700;
94 GND 46700 74900;
94 a 41500 75900;
94 b 43200 75900;
94 z 53700 75900;
94 c 60700 75900;
94 d 71200 75900;
94 e 76500 75900;
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File 



APPENDIX B 



CHAPTER IV LISTINGS 



Statistic - for project gc 

Statistic - options: (herald opt-d opt-c stat obj cif nologo)

Herald - 64, 57 - Reading source file - gc.mac

Herald - 69, 57 - Reading library from - /vlsi/macpit/library

Herald - 911, 622 - Processing definitions

Herald - 912, 622 - Evaluating evals

Herald - 996, 622 - Expanding macros

Herald - 1009, 622 - Extracting sources

Herald - 1012, 622 - Extracting destinations

Herald - 1108, 716 - Extracting labels

Herald - 1108, 716 - Extracting sequencers

Herald - 1110, 716 - Extracting flags, data-path, control, and pins

Statistic - Maximum control depth is 4

Statistic - Number of gates is 26

Statistic - Data-path has 7 Units

Herald - 2625, 1722 - Outputing .obj file

Herald - 2716, 1722 - Extruding gates 

Statistic - Control has 31 columns 

Herald - 8491, 4785 - Extruding straps 

Statistic - Circuit has 280 transistors 

Statistic - Control has 12 tracks 

Statistic - Power consumption is 0.055910 Watts

Herald - 8910, 4993 - Laying out data-path 

Herald - 9070, 5099 - Organelle unit# 1 bit 1 

Herald - 9263, 5207 - Organelle unit# 1 bit 0 

Herald - 9318, 5207 - Organelle unit# 2 bit 1 

Herald - 9549, 5313 - Organelle unit# 2 bit 0 

Herald - 9636, 5313 - Organelle unit# 3 bit 1 

Herald - 9784, 5421 - Organelle unit# 3 bit 0 

Herald - 9846, 5421 - Organelle unit# 4 bit 1 

Herald - 10274, 5652 - Organelle unit# 4 bit 0 

Herald - 10470, 5765 - Organelle unit# 5 bit 1 

Herald - 10509, 5765 - Organelle unit# 5 bit 0 

Herald - 10578, 5765 - Organelle unit# 6 bit 1 

Herald - 10801, 5876 - Organelle unit# 6 bit 0 

Herald - 10997, 5989 - Organelle unit# 7 bit 1 

Herald - 11014, 5989 - Organelle unit# 7 bit 0 

Statistic - Data-path internal bus uses 3 tracks

Herald - 11096, 5989 - Laying out control 

Herald - 13020, 6925 - Laying out flags 

Herald - 13023, 6925 - Laying out river 

Herald - 13168, 7041 - Laying out wing 

Herald - 13177, 7041 - Laying out skeleton 

Herald - 13262, 7041 - Laying out pins 

Statistic - Dimensions are 2.587500 mm by 1.982500 mm 
Herald - 15882, 8254 - Outputing .cif file
Statistic - Memory used - 403K 

Statistic - Compilation took 4.487778 CPU minutes 
Statistic - Garbage collection took 2.328889 CPU minutes 
Statistic - For a total of 79 garbage collections 
Statistic - for project gc2
Statistic - options: (herald opt-d opt-c stat obj cif nologo)
Herald - 61, 54 - Reading source file - gc2.mac
Herald - 64, 54 - Reading library from - /vlsi/macpit/library
Herald - 882, 596 - Processing definitions
Herald - 884, 596 - Evaluating evals
Herald - 967, 596 - Expanding macros
Herald - 986, 596 - Extracting sources
Herald - 1084, 692 - Extracting destinations
Herald - 1086, 692 - Extracting labels
Herald - 1087, 692 - Extracting sequencers
Herald - 1090, 692 - Extracting flags, data-path, control, and pins
Statistic - Maximum control depth is 4
Statistic - Number of gates is 27
Statistic - Data-path has 8 Units
Herald - 2661, 1695 - Outputing .obj file
Herald - 2766, 1695 - Extruding gates
Statistic - Control has 32 columns
Herald - 9213, 5045 - Extruding straps
Statistic - Circuit has 288 transistors
Statistic - Control has 13 tracks
Statistic - Power consumption is 0.057477 Watts
Herald - 9651, 5249 - Laying out data-path
Herald - 9822, 5356 - Organelle unit# 1 bit 1
Herald - 10022, 5464 - Organelle unit# 1 bit 0
Herald - 10072, 5464 - Organelle unit# 2 bit 1
Herald - 10114, 5464 - Organelle unit# 2 bit 0
Herald - 10270, 5571 - Organelle unit# 3 bit 1
Herald - 10503, 5684 - Organelle unit# 3 bit 0
Herald - 10585, 5634 - Organelle unit# 4 bit 1
Herald - 10718, 5792 - Organelle unit# 4 bit 0
Herald - 10755, 5792 - Organelle unit# 5 bit 1
Herald - 11169, 6017 - Organelle unit# 5 bit 0
Herald - 11254, 6017 - Organelle unit# 6 bit 1
Herald - 11422, 6128 - Organelle unit# 6 bit 0
Herald - 11494, 6128 - Organelle unit# 7 bit 1
Herald - 11723, 6241 - Organelle unit# 7 bit 0
Herald - 11916, 6353 - Organelle unit# 8 bit 1
Herald - 11936, 6353 - Organelle unit# 8 bit 0
Statistic - Data-path internal bus uses 3 tracks
Herald - 12034, 6353 - Laying out control
Herald - 14219, 7417 - Laying out flags
Herald - 14224, 7417 - Laying out river
Herald - 14374, 7534 - Laying out wing
Herald - 14383, 7534 - Laying out skeleton
Herald - 14478, 7534 - Laying out pins
Statistic - Dimensions are 2.687500 mm by 1.982500 mm
Herald - 17205, 8788 - Outputing .cif file
Statistic - Memory used - 408K
Statistic - Compilation took 4.823334 CPU minutes
Statistic - Garbage collection took 2.441111 CPU minutes
Statistic - For a total of 83 garbage collections
Statistic - for project stop

Statistic - options: (herald opt-d opt-c stat obj cif nologo)

Herald - 63, 56 - Reading source file - stop.mac

Herald - 74, 56 - Reading library from - /vlsi/macpit/library

Herald - 877, 588 - Processing definitions 

Herald - 878, 588 - Evaluating evals 

Herald - 961, 588 - Expanding macros 

Herald - 1088, 681 - Extracting sources 

Herald - 1094, 681 - Extracting destinations 

Herald - 1102, 681 - Extracting labels 

Herald - 1102, 681 - Extracting sequencers 

Herald - 1107, 681 - Extracting flags, data-path, control, and pins 

Statistic - Maximum control depth is 5

Statistic - Number of gates is 37

Statistic - Data-path has 3 Units

Herald - 2983, 1885 - Outputing .obj file

Herald - 3104, 1885 - Extruding gates 

Statistic - Control has 43 columns 

Herald - 17705, 9477 - Extruding straps 

Statistic - Circuit has 268 transistors 

Statistic - Control has 14 tracks 

Statistic - Power consumption is 0.054698 Watts

Herald - 18256, 9790 - Laying out data-path 

Herald - 18279, 9790 - Organelle unit# 1 bit 1 

Herald - 18773, 10113 - Organelle unit# 1 bit 0 

Herald - 18830, 10113 - Organelle unit# 2 bit 1 

Herald - 19001, 10220 - Organelle unit# 2 bit 0 

Herald - 19075, 10220 - Organelle unit# 3 bit 1 

Herald - 19091, 10220 - Organelle unit# 3 bit 0 

Statistic - Data-path internal bus uses 2 tracks

Herald - 19244, 10327 - Laying out control 

Herald - 21284, 11356 - Laying out flags 

Herald - 21286, 11356 - Laying out river 

Herald - 21307, 11356 - Laying out wing 

Herald - 21333, 11356 - Laying out skeleton 

Herald - 21382, 11356 - Laying out pins 

Statistic - Dimensions are 2.107500 mm by 2.207500 mm 

Herald - 24464, 12791 - Outputing .cif file

Statistic - Memory used - 403K 

Statistic - Compilation took 6.877223 CPU minutes 
Statistic - Garbage collection took 3.587222 CPU minutes 
Statistic - For a total of 123 garbage collections 
Statistic - for project b5
Statistic - options: (herald opt-d opt-c stat obj cif nologo)
Herald - 65, 53 - Reading source file - b4.mac
Herald - 74, 53 - Reading library from - /vlsi/macpit/library
Herald - 898, 596 - Processing definitions
Herald - 899, 596 - Evaluating evals
Herald - 980, 596 - Expanding macros
Herald - 1106, 686 - Extracting sources
Herald - 1113, 686 - Extracting destinations
Herald - 1118, 686 - Extracting labels
Herald - 1118, 686 - Extracting sequencers
Herald - 1121, 686 - Extracting flags, data-path, control, and pins
Statistic - Maximum control depth is 4
Statistic - Number of gates is 53
Statistic - Data-path has 10 Units
Herald - 4032, 2550 - Outputing .obj file
Herald - 4243, 2550 - Extruding gates
Statistic - Control has 63 columns
Herald - 25458, 12382 - Extruding straps
Statistic - Circuit has 1208 transistors
Statistic - Control has 27 tracks
Statistic - Power consumption is 0.201805 Watts
Herald - 26808, 13048 - Laying out data-path
Herald - 27264, 13272 - Organelle unit# 1 bit 4
Herald - 27788, 13612 - Organelle unit# 1 bit 3
Herald - 27815, 13612 - Organelle unit# 1 bit 2
Herald - 27841, 13612 - Organelle unit# 1 bit 1
Herald - 27983, 13727 - Organelle unit# 1 bit 0
Herald - 28111, 13727 - Organelle unit# 2 bit 4
Herald - 28292, 13845 - Organelle unit# 2 bit 3
Herald - 28320, 13845 - Organelle unit# 2 bit 2
Herald - 28349, 13845 - Organelle unit# 2 bit 1
Herald - 28499, 13965 - Organelle unit# 2 bit 0
Herald - 28634, 13965 - Organelle unit# 3 bit 4
Herald - 28886, 14082 - Organelle unit# 3 bit 3
Herald - 28920, 14082 - Organelle unit# 3 bit 2
Herald - 29186, 14313 - Organelle unit# 3 bit 1
Herald - 29220, 14313 - Organelle unit# 3 bit 0
Herald - 29360, 14313 - Organelle unit# 4 bit 4
Herald - 29497, 14430 - Organelle unit# 4 bit 3
Herald - 29509, 14430 - Organelle unit# 4 bit 2
Herald - 29521, 14430 - Organelle unit# 4 bit 1
Herald - 29532, 14430 - Organelle unit# 4 bit 0
Herald - 29602, 14430 - Organelle unit# 5 bit 4
Herald - 30093, 14671 - Organelle unit# 5 bit 3
Herald - 30290, 14794 - Organelle unit# 5 bit 2
Herald - 30358, 14794 - Organelle unit# 5 bit 1
Herald - 30551, 14919 - Organelle unit# 5 bit 0
Herald - 31072, 15171 - Organelle unit# 6 bit 4
Herald - 31346, 15296 - Organelle unit# 6 bit 3
Herald - 31388, 15296 - Organelle unit# 6 bit 2
Herald - 31431, 15296 - Organelle unit# 6 bit 1
Herald - 31599, 15421 - Organelle unit# 6 bit 0
Herald - 31766, 15421 - Organelle unit# 7 bit 4
Herald - 31972, 15545 - Organelle unit# 7 bit 3
Herald - 32001, 15545 - Organelle unit# 7 bit 2
Herald - 32031, 15545 - Organelle unit# 7 bit 1
Herald - 32188, 15671 - Organelle unit# 7 bit 0
Herald - 32331, 15671 - Organelle unit# 8 bit 4
Herald - 32342, 15671 - Organelle unit# 8 bit 3
Herald - 32354, 15671 - Organelle unit# 8 bit 2
Herald - 32493, 15800 - Organelle unit# 8 bit 1
Herald - 32505, 15800 - Organelle unit# 8 bit 0
Herald - 32560, 15800 - Organelle unit# 9 bit 4
Herald - 32916, 15930 - Organelle unit# 9 bit 3
Herald - 33125, 16060 - Organelle unit# 9 bit 2
Herald - 33341, 16196 - Organelle unit# 9 bit 1
Herald - 33422, 16196 - Organelle unit# 9 bit 0
Herald - 33983, 16459 - Organelle unit# 10 bit 4
Herald - 34082, 16459 - Organelle unit# 10 bit 3
Herald - 34297, 16590 - Organelle unit# 10 bit 2
Herald - 34515, 16722 - Organelle unit# 10 bit 1
Herald - 34601, 16722 - Organelle unit# 10 bit 0
Statistic - Data-path internal bus uses 5 tracks
Herald - 35348, 16992 - Laying out control
Herald - 41246, 19921 - Laying out flags
Herald - 41742, 20059 - Laying out river
Herald - 41993, 20197 - Laying out wing
Herald - 42015, 20197 - Laying out skeleton
Herald - 42180, 20197 - Laying out pins
Statistic - Dimensions are 5.770020 mm by 3.125000 mm
Herald - 49229, 23494 - Outputing .cif file
Statistic - Memory used - 518K
Statistic - Compilation took 13.804167 CPU minutes
Statistic - Garbage collection took 6.569723 CPU minutes
Statistic - For a total of 199 garbage collections
;GRAY CODE to BINARY conversion algorithm
(program gc2s 2
  (def 1 ground)
  (def 2 phia)
  (def 3 phib)
  (def 4 phic)
  (def reset signal input 5)
  (def inp signal input 6)
  (def bin signal output 7)
  (def 8 power)
  (process grycod 0
    msbs
      (cond ((not inp) (setq bin (not inp)) (go msbs))
            (inp (setq bin inp) (go compl)))
    compl
      (cond ((not inp) (setq bin inp) (go compl))
            (inp (setq bin (not inp)) (go nextbit)))
    nextbit
      (cond ((not inp) (setq bin (not inp)) (go nextbit))
            (inp (setq bin inp) (go compl)))))



THIS ALGORITHM EXHIBITS THE GRAY CODE 
DECODING SCHEME DONE IN THE CONTROL PATH. 

THE ONLY DATA PATH ORGANELLES INSTANTIATED 
ARE THOSE ASSOCIATED WITH THE SEQUENCER. THE 
WIDTH OF THE SEQUENCER (2 BITS) IS DEFINED 
EXPLICITLY IN THE PROGRAM STATEMENT, EVEN 
THOUGH NO ACTUAL DATA PATH (AS SUCH) EXISTS. 
THE IMPLICATION IS THAT FSMs CAN BE CREATED 
WITHOUT AN "ACTUAL DATA PATH". 
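
For reference, the textbook serial Gray-code decoding scheme can be modeled as a small state machine: with the most significant bit arriving first, each binary output bit is the exclusive-OR of the previous binary bit and the current Gray bit. The sketch below illustrates that standard scheme only; it is not a transcription of the MacPitts program, and the function names are hypothetical.

```python
def gray_to_binary_serial(gray_bits):
    """Decode Gray-code bits, MSB first, producing one binary bit per clock."""
    state = 0  # previous binary bit; 0 before the MSB arrives
    binary_bits = []
    for gray_bit in gray_bits:
        state ^= gray_bit  # next binary bit = previous binary bit XOR Gray bit
        binary_bits.append(state)
    return binary_bits


def to_bits(value, width):
    """Return the bits of value as a list, MSB first."""
    return [(value >> shift) & 1 for shift in reversed(range(width))]


def binary_to_gray(value):
    """Standard binary-to-Gray encoding, used here only to build test input."""
    return value ^ (value >> 1)


if __name__ == "__main__":
    for n in range(8):
        gray = to_bits(binary_to_gray(n), 3)
        print(n, gray, gray_to_binary_serial(gray))
```

Because the only memory needed is the previous output bit, the decoder fits in a sequencer alone, which is consistent with the observation that no data path as such is required.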
Statistic - for project gcs

Statistic - options: (herald opt-d opt-c stat obj cif nologo)

Herald - 65, 55 - Reading source file - gcs.mac

Herald - 70, 55 - Reading library from - /vlsi/macpit/library

Herald - 887, 598 - Processing definitions

Herald - 889, 598 - Evaluating evals

Herald - 975, 598 - Expanding macros 

Herald - 995, 598 - Extracting sources 

Herald - 1094, 692 - Extracting destinations 

Herald - 1095, 692 - Extracting labels 

Herald - 1095, 692 - Extracting sequencers 

Herald - 1098, 692 - Extracting flags, data-path, control, and pins 

Statistic - Maximum control depth is 4
Statistic - Number of gates is 25

Statistic - Data-path has 4 Units
Herald - 2138, 1378 - Outputing .obj file

Herald - 2214, 1378 - Extruding gates

Statistic - Control has 29 columns
Herald - 8365, 4632 - Extruding straps
Statistic - Circuit has 215 transistors
Statistic - Control has 13 tracks
Statistic - Power consumption is 0.041979 Watts
Herald - 8769, 4850 - Laying out data-path 
Herald - 8803, 4850 - Organelle unit# 1 bit 1 

Herald - 9319, 5181 - Organelle unit# 1 bit 0 

Herald - 9397, 5181 - Organelle unit# 2 bit 1 

Herald - 9564, 5296 - Organelle unit# 2 bit 0

Herald - 9635, 5296 - Organelle unit# 3 bit 1 

Herald - 9891, 5407 - Organelle unit# 3 bit 0 

Herald - 10083, 5518 - Organelle unit# 4 bit 1 

Herald - 10101, 5518 - Organelle unit# 4 bit 0 

Statistic - Data-path internal bus uses 3 tracks
Herald - 10155, 5518 - Laying out control 

Herald - 11868, 6353 - Laying out flags 

Herald - 11871, 6353 - Laying out river 

Herald - 12011, 6469 - Laying out wing 

Herald - 12024, 6469 - Laying out skeleton 

Herald - 12093, 6469 - Laying out pins 

Statistic - Dimensions are 1.742500 mm by 1.942500 mm 
Herald - 14192, 7428 - Outputing .clf file 
Statistic - Memory used - 377< 

Statistic - Compilation took 4.008611 CPU minutes 
Statistic - Garbage collection took 2.098333 CPU minutes 
Statistic - For a total of 71 garbage collections 



Gcs.scr
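The compiler listings in these appendices all report the same kinds of "Statistic" lines (transistor count, power, chip dimensions). As a minimal sketch, and not part of MacPitts itself, the following Python fragment shows one way such a listing could be reduced to a few numbers for side-by-side comparison. The function name `parse_statistics` and the dictionary keys are hypothetical; the line formats are taken from the listing above.

```python
import re

def parse_statistics(log_text):
    """Collect a few figures from the 'Statistic - ...' lines of a listing.

    Hypothetical helper: only the three line formats below are recognised.
    """
    stats = {}
    for line in log_text.splitlines():
        line = line.strip()
        if not line.startswith("Statistic - "):
            continue  # skip Herald lines and blank lines
        body = line[len("Statistic - "):]

        m = re.match(r"Dimensions are ([\d.]+) mm by ([\d.]+) mm", body)
        if m:
            width, height = float(m.group(1)), float(m.group(2))
            stats["width_mm"] = width
            stats["height_mm"] = height
            stats["area_mm2"] = width * height  # derived, not printed by the compiler
            continue

        m = re.match(r"Circuit has (\d+) transistors", body)
        if m:
            stats["transistors"] = int(m.group(1))
            continue

        m = re.match(r"Power consumption is ([\d.]+) Watts", body)
        if m:
            stats["power_w"] = float(m.group(1))
    return stats


# Three lines copied from the gcs listing above.
sample = """
Statistic - Circuit has 215 transistors
Statistic - Power consumption is 0.041979 Watts
Statistic - Dimensions are 1.742500 mm by 1.942500 mm
"""
gcs_stats = parse_statistics(sample)
print(gcs_stats)
```

The area entry is simply the product of the two reported dimensions; it is a convenience for comparing layouts and is not a figure the compiler prints.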



;DPLC2.MAC
(program dplc2 5
  (def 13 power)
  (def 1 ground)
  (def 2 phia)
  (def 3 phib)
  (def 4 phic)
  (def c signal input 5)
  (def tl signal input 6)
  (def ts signal input 7)
  (def reset signal input 14)
  ;there are 5 outputs
  ;note use of Boolean inputs
  (def lc port output (8 9 10 11 12))   ;and integer outputs
  (process light-controller 0           ;stipulates FSM architecture
    hg                                  ;HIGHWAY GREEN state
      (cond ((not (and c tl))           ;if TRUE, set these outputs,
             (setq lc 4)
             (go hg))
            (t
             (setq lc 5)
             (go hy)))
    hy                                  ;HIGHWAY YELLOW state
      (cond ((not ts)
             (setq lc 12)
             (go hy))
            (t
             (setq lc 13)
             (go fg)))
    fg                                  ;FARMROAD GREEN state
      (cond ((not (or tl (not c)))
             (setq lc 16)
             (go fg))
            (t
             (setq lc 17)
             (go fy)))
    fy                                  ;FARMROAD YELLOW state
      (cond ((not ts)
             (setq lc 18)
             (go fy))
            (t
             (setq lc 19)
             (go hg)))))



Dplc2.mac



Dplc2.cif
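The state sequencing described by the DPLC2.MAC listing can be checked in software before compilation. The following is a minimal behavioral sketch in Python, not MacPitts output: the function names `step` and `run` are hypothetical, the reset pin is not modeled, and it is assumed that the `setq` value and the `go` target of a clause are both selected from the inputs seen in the current state. The output values for `lc` and the four state labels are transcribed from the listing.

```python
def step(state, c, tl, ts):
    """Return (lc, next_state) for one pass through the given state.

    Conditions and output values follow the cond clauses of DPLC2.MAC.
    """
    if state == "hg":      # HIGHWAY GREEN
        return (4, "hg") if not (c and tl) else (5, "hy")
    if state == "hy":      # HIGHWAY YELLOW
        return (12, "hy") if not ts else (13, "fg")
    if state == "fg":      # FARMROAD GREEN
        return (16, "fg") if not (tl or not c) else (17, "fy")
    if state == "fy":      # FARMROAD YELLOW
        return (18, "fy") if not ts else (19, "hg")
    raise ValueError("unknown state: %r" % (state,))


def run(inputs, state="hg"):
    """Apply a sequence of (c, tl, ts) input triples, starting in `state`."""
    trace = []
    for c, tl, ts in inputs:
        lc, state = step(state, c, tl, ts)
        trace.append((lc, state))
    return trace


# One full cycle: hg -> hy -> fg -> fy -> hg.
trace = run([
    (0, 0, 0),  # no car: stay in highway green
    (1, 1, 0),  # car present and long timer expired: go to highway yellow
    (0, 0, 0),  # short timer not expired: stay in highway yellow
    (0, 0, 1),  # short timer expired: go to farmroad green
    (1, 0, 0),  # car still present, long timer running: stay in farmroad green
    (0, 0, 0),  # car gone: go to farmroad yellow
    (0, 0, 1),  # short timer expired: back to highway green
])
for lc, next_state in trace:
    print(lc, next_state)
```

A table-driven dictionary would work equally well; the chain of `if` tests is used here only because it mirrors the `cond` clauses of the source one for one.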






Statistic - for project dplc2
Statistic - options: (herald opt-d opt-c stat obj clf nologo)
Herald - 62, 55 - Reading source file - dplc2.mac
Herald - 68, 55 - Reading library from - /vlsi/macpit/library
Herald - 905, 604 - Processing definitions
Herald - 906, 604 - Evaluating evals
Herald - 989, 604 - Expanding macros
Herald - 1107, 702 - Extracting sources
Herald - 1111, 702 - Extracting destinations
Herald - 1114, 702 - Extracting labels
Herald - 1114, 702 - Extracting sequencers
Herald - 1117, 702 - Extracting flags, data-path, control, and pins
Statistic - Maximum control depth is 5
Statistic - Number of gates is 34
Statistic - Data-path has 4 Units
Herald - 2277, 1498 - Outputing .obj file
Herald - 2410, 1498 - Extruding gates
Statistic - Control has 40 columns
Herald - 8931, 4725 - Extruding straps
Statistic - Circuit has 346 transistors
Statistic - Control has 17 tracks
Statistic - Power consumption is 0.056716 Watts
Herald - 9580, 5048 - Laying out data-path
Herald - 9922, 5267 - Organelle unit# 1 bit 4
Herald - 10156, 5379 - Organelle unit# 1 bit 3
Herald - 10207, 5379 - Organelle unit# 1 bit 2
Herald - 10375, 5498 - Organelle unit# 1 bit 1
Herald - 10533, 5607 - Organelle unit# 1 bit 0
Herald - 10859, 5718 - Organelle unit# 2 bit 4
Herald - 11242, 5928 - Organelle unit# 2 bit 3
Herald - 11266, 5928 - Organelle unit# 2 bit 2
Herald - 11291, 5928 - Organelle unit# 2 bit 1
Herald - 11316, 5928 - Organelle unit# 2 bit 0
Herald - 11552, 6042 - Organelle unit# 3 bit 4
Herald - 11590, 6042 - Organelle unit# 3 bit 3
Herald - 11722, 6148 - Organelle unit# 3 bit 2
Herald - 11748, 6148 - Organelle unit# 3 bit 1
Herald - 11777, 6148 - Organelle unit# 3 bit 0
Herald - 12052, 6272 - Organelle unit# 4 bit 4
Herald - 12068, 6272 - Organelle unit# 4 bit 3
Herald - 12080, 6272 - Organelle unit# 4 bit 2
Herald - 12204, 6383 - Organelle unit# 4 bit 1
Herald - 12216, 6383 - Organelle unit# 4 bit 0
Statistic - Data-path internal bus uses 2 tracks
Herald - 12313, 6383 - Laying out control
Herald - 14457, 7438 - Laying out flags
Herald - 14461, 7438 - Laying out river
Herald - 14506, 7438 - Laying out wing
Herald - 14521, 7438 - Laying out skeleton
Herald - 14578, 7438 - Laying out pins
Statistic - Dimensions are 2.160000 mm by 2.460000 mm
Herald - 18275, 9184 - Outputing .clf file
Statistic - Memory used - 414K
Statistic - Compilation took 5.164444 CPU minutes
Statistic - Garbage collection took 2.586667 CPU minutes
Statistic - For a total of 86 garbage collections

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APPENDIX C 



CHAPTER V LISTINGS 



Script started on Sat Jun 15 15:14:27 1985
% /vlsi/berk85/bin/crystal splacis.sim
Crystal, v.2
: build splacis.sim
[0:00.5u 0:00.2s 31k]
: inpits c tl ts phia phib
Unknown command: inpits
: inputs c tl ts phia phib
[0:00.0u 0:00.1s 40k]
: outputs st hl0 hl1 fl0 fl1
[0:00.0u 0:00.0s 40k]
: delay phia 0 -1
Marking transistor flow...
Setting Vdd to 1...
Setting GND to 0...
(198 stages examined.)
[0:00.5u 0:00.1s 47k]
: critical
Node hl0 is driven high at 26.93ns
...through fet at (154, -155) to Vdd after
50 is driven low at 23.99ns
...through fet at (158, -106) to 93
...through fet at (156, -59) to GND after
156 is driven high at 18.05ns
...through fet at (5, -61) to Vdd after
73 is driven low at 9.33ns
...through fet at (69, -113) to GND after
41 is driven high at 6.31ns
...through fet at (75, -124) to Vdd after
27 is driven low at 1.95ns
...through fet at (76, -153) to 4
...through fet at (119, -126) to GND after
phia is driven high at 0.00ns
[0:00.1u 0:00.0s 47k]
: critical -g splaphia
[0:00.0u 0:00.1s 52k]
: clear
[0:00.0u 0:00.0s 52k]
: delay phib 0 -1
Marking transistor flow...
Setting Vdd to 1...
Setting GND to 0...
(126 stages examined.)
[0:00.3u 0:00.0s 52k]
: critical
Node hl0 is driven high at 32.06ns
...through fet at (154, -155) to Vdd after
50 is driven low at 29.11ns
...through fet at (158, -106) to 93
...through fet at (156, -59) to GND after
156 is driven high at 23.17ns
...through fet at (5, -61) to Vdd after
73 is driven low at 14.46ns
...through fet at (69, -113) to GND after
41 is driven high at 11.43ns
...through fet at (75, -124) to Vdd after
27 is driven low at 6.97ns
...through fet at (76, -153) to 4
...through fet at (119, -126) to GND after
59 is driven high at 2.67ns
...through fet at (118, -106) to 88
...through fet at (117, 11) to Vdd after
phib is driven high at 0.00ns
[0:00.1u 0:00.1s 52k]
: critical -g splaphib
[0:00.1u 0:00.1s 52k]
: quit
[0:01.7u 0:00.5s 52k] Crystal done.
% ^D

script done on Sat Jun 15 15:16:58 1985
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Script started on Sat Jun 15 15:18:00 1985 
% /vlsi/berk85/bin/crystal lt.sim
Crystal, v.2
: build lt.sim
[0:00.8u 0:00.2s 39k]
: inputs phia phib c tl ts
[0:00.0u 0:00.1s 48k]
: outputs st fl0 fl1 hl0 hl1
[0:00.0u 0:00.0s 48k]
: delay phia 0 -1
Marking transistor flow...
Setting Vdd to 1...
Setting GND to 0...
(21 stages examined.)
[0:00.7u 0:00.1s 50k]
: critical
Node 228 is driven low at 10.16ns
...through fet at (569, 453) to 262
...through fet at (568, 570) to 88
...through fet at (456, 538) to 411
...through fet at (480, 537) to GND after
260 is driven high at 4.92ns
...through fet at (416, 930) to Vdd after
533 is driven low at 0.75ns
...through fet at (365, 942) to GND after
phia is driven high at 0.00ns
[0:00.1u 0:00.0s 50k]
: critical -g ltphia
[0:00.0u 0:00.1s 55k]
: clear
[0:00.0u 0:00.0s 55k]
: delay phib 0 -1
(221 stages examined.)
[0:00.8u 0:00.0s 60k]
: critical
Node st is driven low at 135.82ns
...through fet at (911, 583) to GND after
373 is driven high at 133.89ns
...through fet at (893, 510) to Vdd after
398 is driven low at 131.02ns
...through fet at (866, 570) to GND after
364 is driven high at 123.52ns
...through fet at (877, 510) to Vdd after
76 is driven low at 108.50ns
...through fet at (584, 411) to 88
...through fet at (478, 435) to 201
...through fet at (472, 415) to GND after
190 is driven high at 15.03ns
...through fet at (479, 406) to 163
...through fet at (666, 930) to Vdd after
181 is driven high at 4.91ns
...through fet at (541, 930) to Vdd after
535 is driven low at 0.75ns
...through fet at (490, 942) to GND after
phib is driven high at 0.00ns
[0:00.1u 0:00.1s 60k]
: critical -g ltphib
[0:00.1u 0:00.0s 65k]



Crystal Analysis of PLA Light Controller Chip 



Script started on Thu Jun 13 23:30:02 1985
% powest -p < lt.sim
vdd=5V, gamma=0.4V**.5, tox=9e-08m, u0=0.08m**2/V-s, vtd=-3.5V, ...0.8V, vsb=2V

#dev   Pdc_avg (W)   Pdc_max (W)   type
   0   0.000000      0.000000      enhancement pullups
  20   0.011980      0.023959      depletion pullups
  15   0.030536      0.061072      special depletion pu
  35   0.042516      0.085032      TOTAL
%
script done on Thu Jun 13 23:31:12 1985



Powest Analysis
% /vlsi/berk85/bin/crystal stop.sim
: inputs c tl ts rst
: outputs st hl0 hl1 fl0 fl1

: set 1 phia phic
: delay phib 0 -1
: critical
: clear
(9.6ns)

: set 1 phia
: delay phib -1 0
: delay phic -1 0
: critical
: clear
(56.67ns)

: set 0 phib phic
: delay phia -1 0
: critical
: clear
(17.55ns)

: set 0 phib phic
: delay phia 0 -1
: critical
: clear
(54.63ns)

: set 1 phia
: set 0 phib
: delay phic 0 -1
: critical
: quit
(363.52ns)



Crystal Command File for MacPitts Light Controller Chip



APPENDIX D 



CHAPTER VI LISTINGS 



Statistic - for project ham15.4
Statistic - options: (herald opt-d opt-c stat obj clf nologo)
Herald - 59, 52 - Reading source file - ham15.4.mac
Herald - 78, 52 - Reading library from - /vlsi/macpit/library
Herald - 890, 591 - Processing definitions
Herald - 894, 591 - Evaluating evals
Herald - 980, 591 - Expanding macros
Herald - 2822, 1405 - Extracting sources
Herald - 2982, 1511 - Extracting destinations
Herald - 3015, 1511 - Extracting labels
Herald - 3015, 1627 - Extracting sequencers
Herald - 3131, 1627 - Extracting flags, data-path, control, and pins
Statistic - Maximum control depth is 7
Statistic - Number of gates is 140
Statistic - Data-path has 0 Units
Herald - 9964, 4968 - Outputing .obj file
Herald - 10373, 4968 - Extruding gates
Statistic - Control has 155 columns
Herald - 586415, 233036 - Extruding straps
Statistic - Circuit has 715 transistors
Statistic - Control has 42 tracks
Statistic - Power consumption is 0.160860 Watts
Herald - 589965, 234452 - Laying out data-path
Statistic - Data-path internal bus uses 0 tracks
Herald - 589967, 234452 - Laying out control
Herald - 599196, 239812 - Laying out flags
Herald - 599197, 239812 - Laying out river
Herald - 599206, 239812 - Laying out wing
Herald - 599281, 239812 - Laying out skeleton
Herald - 599325, 239812 - Laying out pins
Statistic - Dimensions are 5.137500 mm by 4.005000 mm
Herald - 606259, 242522 - Outputing .clf file
Statistic - Memory used - 529K
Statistic - Compilation took 168.593902 CPU minutes
Statistic - Garbage collection took 67.456947 CPU minutes
Statistic - For a total of 1805 garbage collections



Ham15dc.scr
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